Abstract

We study the performance of adaptive Fourier-Galerkin methods in a periodic box in $\mathbb{R}^d$
with dimension $d \ge 1$. These methods offer unlimited approximation power only restricted
by solution and data regularity. They are of intrinsic interest but are also a first step towards 
understanding adaptivity for the hp-FEM. We examine two nonlinear approximation classes, 
one classical corresponding to algebraic decay of Fourier coefficients and another associated 
with exponential decay. We study the sparsity classes of the residual and show that they 
are the same as the solution for the algebraic class but not for the exponential one. This 
possible sparsity degradation for the exponential class can be compensated with coarsening, 
which we discuss in detail. We present several adaptive Fourier algorithms, and prove their 
contraction and optimal cardinality properties. 

Keywords: Spectral methods, adaptivity, convergence, optimal cardinality. 

1 Introduction 



Adaptivity is now a fundamental tool in scientific and engineering computation. In contrast
to the practice, which goes back to the 1970s, the mathematical theory for multidimensional
problems is rather recent. It started in 1996 with the convergence results by Dörfler [13] and
Morin, Nochetto, and Siebert [18]. The first convergence rates were derived by Cohen, Dahmen,
and DeVore [7] for wavelets in any dimension $d$, and for finite element methods (AFEM) by
Binev, Dahmen, and DeVore [2] for $d = 2$ and Stevenson [21] for any $d$. The most comprehensive
results for AFEM are those of Cascon, Kreuzer, Nochetto, and Siebert [6] for any $d$ and $L^2$ data,
and Cohen, DeVore, and Nochetto [8] for $d = 2$ and $H^{-1}$ data; we refer to the survey [19]
by Nochetto, Siebert and Veeser. This theory is quite satisfactory in that it shows that AFEM
delivers a convergence rate compatible with that of the approximation classes where the solution
and data belong. The recent results in [8] reveal that it is the approximation class of the solution
that really matters. In all cases, though, the convergence rates are limited by the approximation
power of the method (both wavelets and FEM), which is finite and related to the polynomial
degree of the basis functions, and by the regularity of the solution and data. The latter is always
measured in an algebraic approximation class.

In contrast very little is known for methods with infinite approximation power, such as 
those based on Fourier analysis. We mention here the results of DeVore and Temlyakov [12] for
trigonometric sums and those of Binev et al pQ for the reduced basis method. A close relative 
to Fourier methods is the so-called p- version of the FEM (see e.g. [2D] and [5]), which uses 
Legendre polynomials instead of exponentials as basis functions. The purpose of this paper is 
to present adaptive Fourier-Galerkin methods (ADFOUR), and discuss their convergence and
optimality properties. We do so in the context of both algebraic and exponential approximation
classes, and take advantage of the orthogonality inherent to complex exponentials. We
believe that this approach can be extended to the p-FEM. We view this theory as a first step
towards understanding adaptivity for the hp-FEM, which combines mesh refinement (h-FEM)
with polynomial enrichment (p-FEM) and is much harder to analyze.

Our investigation reveals some striking differences between ADFOUR and AFEM and wavelet 
methods. The basic assumption, underlying the success of adaptivity, is that the information 
read in the residual is quasi-optimal for either mesh design or choosing wavelet coefficients for 
the actual solution. This entails that the sparsity classes of the residual and the solution
coincide. We briefly illustrate below, and fully discuss later in Sect. 5, that this basic premise is
false for exponential classes even though it is true for algebraic classes. Confronted with this
unexpected fact, we have no alternative but to implement and study ADFOUR with coarsening
for the exponential case; see Sect. 6 and Sect. 8. This was the original idea of Cohen et al. [7]
and Binev et al. [2] for the algebraic case, but it was subsequently removed by Stevenson [21].

We give now a brief description of the essential issues we are confronted with in designing 
and studying ADFOUR. To this end, we assume that we know the Fourier representation
$v = \{v_k\}_{k\in\mathbb{Z}}$ of a periodic function $v$, and its non-increasing rearrangement
$v^* = \{v_n^*\}_{n=1}^\infty$, namely, $|v_{n+1}^*| \le |v_n^*|$ for all $n \ge 1$.

Dörfler marking and best N-term approximation. We recall the marking introduced by
Dörfler [13], which is the only one for which there exist provable convergence rates. Given a
parameter $\theta \in (0, 1)$ and a current set of Fourier frequencies or indices $\Lambda$, say the first $N$ ones
according to the labeling of $v^*$, we choose the next set $\partial\Lambda$ as the minimal set for which

$$\|P_{\partial\Lambda} r\| \ge \theta \|r\|, \tag{1.1}$$

where $r := v - P_\Lambda v$ is the residual and $P_\Lambda$ is the orthogonal projection in the $\ell^2$-norm $\|\cdot\|$ onto
$\Lambda$. Note that, if $r_* := r - P_{\partial\Lambda} r$ and $\Lambda_* := \Lambda \cup \partial\Lambda$, then (1.1) can be equivalently written as

$$\|r - P_{\partial\Lambda} r\| \le \sqrt{1-\theta^2}\, \|r\|, \tag{1.2}$$

and that $r = v|_{\Lambda^c}$, where $\Lambda^c := \mathbb{N} \setminus \Lambda$ is the complement of $\Lambda$, and likewise for $r_*$. This is the
simplest possible scenario because the information built in $r$ is exactly that of $v$. Moreover,
$v - r = \{v_n^*\}_{n=1}^N$ is the best N-term approximation of $v$ in the $\ell^2$-norm and the corresponding
error $E_N(v)$ is given by

$$E_N(v) = \Big( \sum_{n>N} |v_n^*|^2 \Big)^{1/2} = \|r\|. \tag{1.3}$$
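The marking step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residual is stored as a dictionary from index to coefficient, and the name `dorfler_mark` is hypothetical. Sorting by decreasing modulus and accumulating until the bulk condition (1.1) holds gives a set of minimal cardinality.

```python
def dorfler_mark(residual, theta):
    """Return a minimal index list S with ||P_S r|| >= theta * ||r|| (l2 norms).

    residual: dict {index: coefficient} (illustrative data layout).
    """
    total = sum(abs(c) ** 2 for c in residual.values())
    # Greedy choice of the largest coefficients is minimal in cardinality.
    order = sorted(residual, key=lambda k: abs(residual[k]), reverse=True)
    marked, captured = [], 0.0
    for k in order:
        if captured >= theta ** 2 * total:
            break
        marked.append(k)
        captured += abs(residual[k]) ** 2
    return marked
```

For instance, with coefficients 3 and -4 the choice theta = 0.5 marks only the larger one, while theta = 0.9 requires both.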
Algebraic vs exponential decay. Suppose now that $v$ has the precise algebraic decay¹

$$|v_n^*| \simeq n^{-1/\tau} \qquad \forall n \ge 1, \tag{1.4}$$

with

$$\frac{1}{\tau} = \frac{s}{d} + \frac{1}{2}$$

and $s > 0$. We denote by $\|v\|_{\ell_B^s}$ the smallest constant in the upper bound in (1.4). We thus
have

$$E_N(v)^2 \simeq \|v\|_{\ell_B^s}^2 \sum_{n>N} n^{-2/\tau} = \|v\|_{\ell_B^s}^2 \sum_{n>N} n^{-\frac{2s}{d}-1} \simeq \|v\|_{\ell_B^s}^2 N^{-\frac{2s}{d}}.$$

This decay is related to certain Besov regularity of $v$ [12]. Note that the effect of Dörfler marking
(1.2) is to reduce the residual from $r$ to $r_*$ by a factor $\alpha = \sqrt{1-\theta^2}$, or equivalently

$$E_{N_*}(v) \le \alpha E_N(v),$$

with $N_* = |\Lambda_*|$. Since the set $\Lambda_*$ is minimal, we deduce that $E_{N_*-1}(v) > \alpha E_N(v)$, whence

$$\frac{N_*}{N} \simeq \alpha^{-d/s}, \qquad N_* - N \simeq \alpha^{-d/s} N, \tag{1.6}$$

for $\alpha$ small enough. This means that the number of degrees of freedom to be added is proportional
to the current number. This simplifies considerably the complexity analysis since every step adds
as many degrees of freedom as we have already accumulated.

The exponential case is quite different. Suppose that $v$ has a genuinely exponential decay

$$|v_n^*| \simeq e^{-\eta n} \qquad \forall n \ge 1, \tag{1.7}$$

corresponding to analytic functions [14], and let $\|v\|_{\ell_G^\eta}$ be the smallest constant appearing in
the upper bound in (1.7). These definitions are slight simplifications of the actual ones in Sect.
3 but enough to give insight on the main issues at stake. We thus have

$$E_N(v)^2 \simeq \|v\|_{\ell_G^\eta}^2 \sum_{n>N} e^{-2\eta n} \simeq \|v\|_{\ell_G^\eta}^2\, e^{-2\eta N};$$

this and similar decays are related to Gevrey classes of $C^\infty$ functions [14]. In contrast to (1.6),
Dörfler marking now yields²

$$N_* - N \sim \frac{1}{\eta} \log \frac{1}{\alpha}. \tag{1.8}$$

This shows that the number of additional degrees of freedom per step is fixed and independent
of $N$, which makes their counting as well as their implementation a very delicate operation.
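The contrast between (1.6) and (1.8) can be checked numerically. The sketch below is illustrative only: the helper names `tail_errors` and `next_cardinality` are hypothetical, the sequences are truncated to finitely many terms (an assumption that slightly perturbs the tails), and $d = s = 1$ is chosen so that $1/\tau = 3/2$.

```python
import math

def tail_errors(coeffs):
    """errs[N] = E_N = sqrt(sum_{n>N} |v_n^*|^2) for N = 0..len(coeffs).

    coeffs[n-1] holds v_n^* (already in non-increasing order).
    """
    errs = [0.0] * (len(coeffs) + 1)
    acc = 0.0
    for n in range(len(coeffs), 0, -1):  # accumulate from the small terms upward
        acc += coeffs[n - 1] ** 2
        errs[n - 1] = math.sqrt(acc)
    return errs

def next_cardinality(errs, N, alpha):
    """Smallest N_* > N such that E_{N_*} <= alpha * E_N."""
    target = alpha * errs[N]
    M = N + 1
    while errs[M] > target:
        M += 1
    return M

# Algebraic decay n^(-3/2): N_*/N stays close to alpha^(-d/s) = 2.
alg = tail_errors([n ** -1.5 for n in range(1, 200001)])
ratios = [next_cardinality(alg, N, 0.5) / N for N in (100, 200)]

# Exponential decay exp(-n/2): N_* - N is the same for every N.
expo = tail_errors([math.exp(-0.5 * n) for n in range(1, 401)])
increments = [next_cardinality(expo, N, 0.5) - N for N in (10, 50)]
```

In this experiment the algebraic ratios stay near 2 while the exponential increments equal a fixed small integer, in line with the two regimes described above.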

Plateaux. We now consider a situation opposite to the ideal decay examined above. Suppose
that the first $K > 1$ Fourier coefficients of $v$ are constant and either

$$|v_n^*| = \|v\|_{\ell_B^s}\, n^{-1/\tau} \quad \text{or} \quad |v_n^*| = \|v\|_{\ell_G^\eta}\, e^{-\eta n} \qquad \forall n \ge K, \tag{1.9}$$

for each approximation class. A simple calculation reveals that either

$$\|v\| \simeq \|v\|_{\ell_B^s} K^{-s/d} \quad \text{or} \quad \|v\| \simeq \|v\|_{\ell_G^\eta}\, e^{-\eta K}. \tag{1.10}$$

Repeating the argument leading to (1.6) and (1.8) with $N = 1$, we infer that either

$$N_* \simeq K \alpha^{-d/s} \quad \text{or} \quad N_* \sim K + \frac{1}{\eta} \log \frac{1}{\alpha}. \tag{1.11}$$

For $K \gg 1$ this is a much larger number than the optimal values (1.6) and (1.8), and illustrates
the fact that the Dörfler condition (1.1) adds many more frequencies in the presence of plateaux.
We note that $K$ is a multiplicative constant in the left of (1.11) and additive in the right of (1.11).

¹ Throughout the paper, $A \lesssim B$ means $A \le cB$ for some constant $c > 0$ independent of the relevant
parameters in the inequality; $A \simeq B$ means $B \lesssim A \lesssim B$.

² Throughout the paper, $A \sim B$ means $A = B + c$ for some quantity $c \simeq 1$.

Sparsity of the residual. In practice we do not have access to the Fourier decomposition of $v$
but rather of the residual $r(v) = f - Lv$, where $f$ is the forcing function and $L$ the differential
operator. Only an operator $L$ with constant coefficients leads to a spectral representation with
diagonal matrix $A$, in which case the components of the residual $r = f - Av$ are directly
those of $f$ and $v$. In general $A$ decays away from the main diagonal with a law that depends
on the regularity of the coefficients of $L$; we will examine in Sect. 2.4 either algebraic or
exponential decay. In this much more intricate and interesting endeavor, studied in this paper,
the components of $v$ interact with entries of $A$ to give rise to $r$. The question whether $Lv$
belongs to the same approximation class of $v$ thus becomes relevant because adaptivity decisions
are made with $r(v)$, and thereby on the range of $L$ rather than its domain.

We now provide insight on the key issues at stake via a couple of heuristic examples; we
discuss this fully in Sect. 5.1 and Sect. 5.2. We start with the exponential case: let $v := \{v_k\}_{k\in\mathbb{Z}}$
be defined by

$$v_k = e^{-\eta n} \ \text{ if } k = 2p(n-1), \qquad v_k = 0 \ \text{ otherwise},$$

for $p \ge 2$ a given integer and $n \ge 1$. This sequence exhibits gaps of size $2p$ between consecutive
nonzero entries for $k > 0$. Its non-increasing rearrangement $v^* = \{v_n^*\}_{n=1}^\infty$ is thus given by

$$v_n^* = e^{-\eta n}, \qquad n \ge 1,$$

whence $v \in \ell_G^\eta$ with $\|v\|_{\ell_G^\eta} = 1$. Let $A := (a_{ij})_{i,j\in\mathbb{Z}}$ be the Toeplitz bi-infinite matrix given by

$$a_{ij} = 1 \ \text{ if } |i - j| \le q, \qquad a_{ij} = 0 \ \text{ otherwise},$$

with $1 \le q < p$. This matrix $A$ has $2q + 1$ main nontrivial diagonals and is both of exponential
and algebraic class according to Definition 2.1 below. The product $Av$ is much less sparse
than $v$ but, because $q < p$, consecutive frequencies of $v$ do not interact with each other: the
$i$-th component reads

$$(Av)_i = e^{-\eta n} \ \text{ if } |i - 2p(n-1)| \le q \text{ for some } n \ge 1,$$

or $(Av)_i = 0$ otherwise. The non-increasing rearrangement $(Av)^*$ of $Av$ becomes

$$(Av)^*_m = e^{-\eta n} \ \text{ if } (2q+1)(n-1) + 1 \le m \le (2q+1)n.$$

Consequently, writing $(Av)^*_m = e^{-\eta \frac{n}{m} m}$ and observing that

$$\frac{n}{m} \ge \frac{n}{(2q+1)n} = \frac{1}{2q+1},$$

and the equality is attained for $m = (2q+1)n$, we deduce

$$Av \in \ell_G^{\bar\eta} \quad \text{with } \|Av\|_{\ell_G^{\bar\eta}} = 1, \qquad \bar\eta = \frac{\eta}{2q+1}.$$

We thus conclude that the action of $A$ may shift the exponential class, from the one characterized
by the parameter $\eta$ for $v$ to the one characterized by $\bar\eta < \eta$ for $Av$. This uncovers the crucial
feature that the image $Av$ of $v$ may be substantially less sparse than $v$ itself. In Sect. 5.2 we
present a rigorous construction with $a_{ij}$ decreasing exponentially from the main diagonal and
another, rather sophisticated, construction that illustrates the fact that the exponent $\tau = 1$
in the bound $|v_n^*| \lesssim e^{-\eta n} = e^{-\eta n^\tau}$ for $v$ may deteriorate to some $\bar\tau < 1$ in the corresponding
bound for $Av$.
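The shift of exponential class in the banded example can be reproduced numerically. The following Python sketch is only illustrative: the helper names are hypothetical, the sequence is truncated to 30 nonzero entries, and the values $\eta = 1$, $p = 3$, $q = 2$ are arbitrary choices satisfying $1 \le q < p$.

```python
import math

def banded_apply(v, q):
    """(Av)_i = sum over |i-j| <= q of v_j, for sparse v given as {index: value}."""
    out = {}
    for j, val in v.items():
        for i in range(j - q, j + q + 1):
            out[i] = out.get(i, 0.0) + val
    return out

def rearrange(v):
    """Non-increasing rearrangement of the moduli of the nonzero entries."""
    return sorted((abs(x) for x in v.values() if x != 0.0), reverse=True)

def exp_constant(w_star, eta):
    """sup_m |w_m^*| exp(eta*m): smallest C in the bound |w_m^*| <= C exp(-eta*m)."""
    return max(x * math.exp(eta * m) for m, x in enumerate(w_star, start=1))

eta, p, q, n_terms = 1.0, 3, 2, 30
v = {2 * p * (n - 1): math.exp(-eta * n) for n in range(1, n_terms + 1)}
Av_star = rearrange(banded_apply(v, q))

c_v = exp_constant(rearrange(v), eta)                 # close to 1
c_Av_shifted = exp_constant(Av_star, eta / (2 * q + 1))  # close to 1
c_Av_same = exp_constant(Av_star, eta)                # grows with n_terms
```

With the original rate the constant for $Av$ blows up as more terms are included, whereas with the reduced rate $\eta/(2q+1)$ it stays equal to 1 up to rounding.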

It is remarkable that a similar construction for the algebraic decay would not lead to a change
of algebraic class. In fact, let $v = \{v_k\}_{k\in\mathbb{Z}}$ be given by

$$v_k = \frac{1}{n} \ \text{ if } k = 2p(n-1) \text{ for some } n \ge 1,$$

and $v_k = 0$ otherwise. The non-increasing rearrangement $v^* = \{v_n^*\}_{n=1}^\infty$ of $v$ satisfies $v_n^* = \frac{1}{n}$,
whence

$$v \in \ell_B^s \quad \text{with } \frac{s}{d} = \frac{1}{2}, \qquad \|v\|_{\ell_B^s} = 1.$$

On the other hand, the $i$-th component of $Av$ reads

$$(Av)_i = \frac{1}{n} \ \text{ if } |i - 2p(n-1)| \le q \text{ for some } n \ge 1,$$

or $(Av)_i = 0$ otherwise. The non-increasing rearrangement $(Av)^*$ in turn satisfies

$$(Av)^*_m = \frac{1}{n} \ \text{ if } (2q+1)(n-1) + 1 \le m \le (2q+1)n,$$

whence, writing $(Av)^*_m = \frac{m}{n} \frac{1}{m}$ and arguing as before, we infer that

$$Av \in \ell_B^s \quad \text{with } \|Av\|_{\ell_B^s} = 2q + 1.$$

Since $\|Av\|_{\ell_B^s} > \|v\|_{\ell_B^s}$ we realize that $Av$ is less sparse than $v$ but, in contrast to the exponential
case, they belong to the same algebraic class $\ell_B^s$. Moreover, we will prove later in Sect. 5.1 that
$A$ preserves the class $\ell_B^s$ provided entries of $A$ possess a suitable algebraic decay away from the
main diagonal.

Since Dörfler marking is applied to the residual $r$, it is its sparsity class that determines the
degrees of freedom $|\partial\Lambda|$ to be added. The same argument leading to either (1.6) or (1.8) gives

$$|\partial\Lambda| \le \Big( \frac{\|r\|_{\ell_B^s}}{\alpha \|r\|} \Big)^{d/s} + 1 \quad \text{or} \quad |\partial\Lambda| \le \frac{1}{\eta} \log \frac{\|r\|_{\ell_G^\eta}}{\alpha \|r\|} + 1,$$

for each class. We thus see that the ratios $\|r\|_{\ell_B^s}/\|r\|$ and $\|r\|_{\ell_G^\eta}/\|r\|$ control the behavior of the
adaptive procedure. This has already been observed and exploited by Cohen et al. [7] in the
context of wavelet methods for the class $\ell_B^s$. Our estimates, discussed in Sect. 5, are valid for
both classes and use specific decay properties of the entries of $A$.

Coarsening. Ever since its inception by Cohen et al. [7] and Binev et al. [2], this has been
a controversial issue for elliptic PDE. It was originally due to the lack of control on the ratio
$\|r\|_{\ell_B^s}/\|r\|$ for large $s$ [7]. It was removed by Stevenson et al. [16, 21] for the algebraic class $\ell_B^s$
via a clever argument that exploits the minimality of Dörfler marking. This implicitly implies
that the approximation classes for both $v$ and $Lv$ coincide, which we prove explicitly in Sect.
5.1 for the algebraic case. This is not true though for the exponential case and is discussed in
Sect. 5.2. For the latter, we need to resort to coarsening to keep the cardinality of ADFOUR
quasi-optimal. To this end, we construct an insightful example in Sect. 6 and prove a rather
simple but sharp coarsening estimate which improves upon [7].

Contraction constant. It is well known that the contraction constant $\rho(\theta) = \sqrt{1 - \frac{\alpha_*}{\alpha^*} \theta^2}$
cannot be arbitrarily close to 0 for estimators whose upper and lower constants, $\alpha^* \ge \alpha_*$, do not
coincide. This is, however, at odds with the philosophy of spectral methods, which are expected
to converge superlinearly (typically exponentially). Assuming that the decay properties of $A$
are known, we can enrich Dörfler marking in such a way that the contraction factor becomes



This leads to a contraction factor as close to 0 as desired and to aggressive versions of ADFOUR discussed in
Sect. 

This paper can be viewed as a first step towards understanding adaptivity for the hp-FEM.
However, the results we present are of intrinsic interest and of value for periodic problems with 
high degree of regularity and rather complex structure. One such problem is turbulence in a 
periodic box. Our techniques exploit periodicity and orthogonality of the complex exponentials, 
but many of our assertions and conclusions extend to the non-periodic case for which the natural 
basis functions are Legendre polynomials; this is the case of the p-FEM. In any event, the study 
of adaptive Fourier-Galerkin methods seems to be a new paradigm in adaptivity, with many 
intriguing questions and surprises, some discussed in this paper. In contrast to the h-FEM, they
exhibit unlimited approximation power which is only restricted by solution and data regularity. 

We organize the paper as follows. In Sect. 2 we introduce the Fourier-Galerkin method,
present a posteriori error estimators, and discuss properties of the underlying matrix $A$ for
both algebraic and exponential approximation classes. In Sect. 3 we deal with four algorithms,
two for each class, and prove their contraction properties. We devote Sect. 4 to nonlinear
approximation theory with an emphasis on the exponential class. In Sect. 5 we turn to the
study of the sparsity classes for the residual $r$ along the lines outlined above. We examine the
role of coarsening and prove a sharp coarsening estimate in Sect. 6. We conclude with optimality
properties of ADFOUR for the algebraic class in Sect. 7 and for the exponential class in Sect. 8.

2 Fourier-Galerkin approximation



2.1 Fourier basis and norm representation 

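For $d \ge 1$, we consider $\Omega = (0, 2\pi)^d$, and the trigonometric basis

$$\phi_k(x) = \frac{1}{(2\pi)^{d/2}}\, e^{i k \cdot x}, \qquad k \in \mathbb{Z}^d, \ x \in \mathbb{R}^d.$$

Let

$$v = \sum_k \hat v_k \phi_k, \quad \hat v_k = (v, \phi_k), \qquad \|v\|_{L^2(\Omega)}^2 = \sum_k |\hat v_k|^2$$

be the expansion of any $v \in L^2(\Omega)$ and the representation of its norm via the Parseval identity.
Let $H^1_p(\Omega) = \{v \in H^1(\Omega) : v(x + 2\pi e_j) = v(x), \ 1 \le j \le d\}$, and let $H^{-1}_p(\Omega)$ be its dual. Since
the trigonometric basis is orthogonal in $H^1_p(\Omega)$ as well, one has for any $v \in H^1_p(\Omega)$

$$\|v\|_{H^1_p(\Omega)}^2 = \sum_k (1 + |k|^2) |\hat v_k|^2 = \sum_k |\hat V_k|^2 \qquad (\text{setting } \hat V_k := \sqrt{1 + |k|^2}\, \hat v_k); \tag{2.1}$$

here and in the sequel, $|k|$ denotes the Euclidean norm of the multi-index $k$. On the other hand,
if $f \in H^{-1}_p(\Omega)$, we set

$$\hat f_k = \langle f, \phi_k \rangle, \quad \text{so that} \quad \langle f, v \rangle = \sum_k \hat f_k \overline{\hat v_k} \qquad \forall v \in H^1_p(\Omega);$$

the norm representation is

$$\|f\|_{H^{-1}_p(\Omega)}^2 = \sum_k \frac{1}{1 + |k|^2} |\hat f_k|^2 = \sum_k |\hat F_k|^2 \qquad \Big(\text{setting } \hat F_k := \frac{1}{\sqrt{1 + |k|^2}}\, \hat f_k\Big). \tag{2.2}$$

Throughout the paper, we will use the notation $\|\cdot\|$ to indicate both the $H^1_p(\Omega)$-norm of a
function $v$ and the $H^{-1}_p(\Omega)$-norm of a linear form $f$; the specific meaning will be clear from the
context.

Given any finite index set $\Lambda \subset \mathbb{Z}^d$, we define the subspace of $V := H^1_p(\Omega)$

$$V_\Lambda := \operatorname{span}\{\phi_k \mid k \in \Lambda\};$$

we set $|\Lambda| = \operatorname{card} \Lambda$, so that $\dim V_\Lambda = |\Lambda|$. If $g$ admits an expansion $g = \sum_k \hat g_k \phi_k$ (converging
in an appropriate norm), then we define its projection $P_\Lambda g$ upon $V_\Lambda$ by setting

$$P_\Lambda g = \sum_{k \in \Lambda} \hat g_k \phi_k.$$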
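The norm representations (2.1) and (2.2) are easy to test in one dimension. The sketch below is an illustration under assumptions: the basis is taken as $\phi_k(x) = e^{ikx}/\sqrt{2\pi}$, the coefficients are computed with the trapezoidal rule (exact up to rounding for trigonometric polynomials of low degree), and the function names are hypothetical.

```python
import cmath
import math

def fourier_coeffs(v, kmax, n=64):
    """Coefficients (v, phi_k) for |k| <= kmax, d = 1, via the trapezoidal rule."""
    h = 2 * math.pi / n
    xs = [j * h for j in range(n)]
    scale = h / math.sqrt(2 * math.pi)
    return {k: scale * sum(v(x) * cmath.exp(-1j * k * x) for x in xs)
            for k in range(-kmax, kmax + 1)}

def h1_norm_sq(coeffs):
    """Sum of (1 + k^2) |v_k|^2, cf. (2.1)."""
    return sum((1 + k * k) * abs(c) ** 2 for k, c in coeffs.items())

def hm1_norm_sq(coeffs):
    """Sum of |f_k|^2 / (1 + k^2), cf. (2.2)."""
    return sum(abs(c) ** 2 / (1 + k * k) for k, c in coeffs.items())

coeffs = fourier_coeffs(lambda x: math.cos(3 * x), kmax=8)
```

For $v(x) = \cos(3x)$ the direct computation gives $\|v\|_{L^2}^2 = \pi$ and $\|v'\|_{L^2}^2 = 9\pi$, so the weighted sum should return $10\pi$, and the dual-weighted sum $\pi/10$.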

2.2 Galerkin discretization and residual 

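We now consider the elliptic problem

$$\begin{cases} Lu = -\nabla \cdot (\nu \nabla u) + \sigma u = f & \text{in } \Omega, \\ u \ \ 2\pi\text{-periodic in each direction}, \end{cases} \tag{2.3}$$

where $\nu$ and $\sigma$ are sufficiently smooth real coefficients satisfying $0 < \nu_* \le \nu(x) \le \nu^* < \infty$ and
$0 < \sigma_* \le \sigma(x) \le \sigma^* < \infty$ in $\Omega$; let us set

$$\alpha_* = \min(\nu_*, \sigma_*) \quad \text{and} \quad \alpha^* = \max(\nu^*, \sigma^*).$$

We formulate this problem variationally as

$$u \in H^1_p(\Omega) : \quad a(u, v) = \langle f, v \rangle \qquad \forall v \in H^1_p(\Omega), \tag{2.4}$$

where $a(u, v) = \int_\Omega \nu \nabla u \cdot \nabla \bar v + \int_\Omega \sigma u \bar v$ (bar indicating as usual complex conjugate). We denote
by $|||v||| = \sqrt{a(v, v)}$ the energy norm of any $v \in H^1_p(\Omega)$, which satisfies

$$\sqrt{\alpha_*}\, \|v\| \le |||v||| \le \sqrt{\alpha^*}\, \|v\|. \tag{2.5}$$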
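Given any finite set $\Lambda \subset \mathbb{Z}^d$, the Galerkin approximation is defined as

$$u_\Lambda \in V_\Lambda : \quad a(u_\Lambda, v_\Lambda) = \langle f, v_\Lambda \rangle \qquad \forall v_\Lambda \in V_\Lambda. \tag{2.6}$$

For any $w \in V_\Lambda$, we define the residual

$$r(w) = f - Lw = \sum_k \hat r_k(w) \phi_k, \quad \text{where } \hat r_k(w) = \langle f - Lw, \phi_k \rangle = \langle f, \phi_k \rangle - a(w, \phi_k).$$

Then, the previous definition of $u_\Lambda$ is equivalent to the condition

$$P_\Lambda r(u_\Lambda) = 0, \quad \text{i.e.,} \quad \hat r_k(u_\Lambda) = 0 \quad \forall k \in \Lambda. \tag{2.7}$$

On the other hand, by the continuity and coercivity of the bilinear form $a$, one has

$$\frac{1}{\alpha^*} \|r(u_\Lambda)\| \le \|u - u_\Lambda\| \le \frac{1}{\alpha_*} \|r(u_\Lambda)\|, \tag{2.8}$$

or, equivalently,

$$\frac{1}{\sqrt{\alpha^*}} \|r(u_\Lambda)\| \le |||u - u_\Lambda||| \le \frac{1}{\sqrt{\alpha_*}} \|r(u_\Lambda)\|. \tag{2.9}$$
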
2.3 Algebraic representations 

Let us identify the solution $u = \sum_k \hat u_k \phi_k$ of Problem (2.4) with the vector $\mathbf{u} = (\hat U_k) = (c_k \hat u_k) \in \mathbb{C}^{\mathbb{Z}^d}$
of its $H^1_p$-normalized Fourier coefficients, where we set for convenience $c_k = \sqrt{1 + |k|^2}$.
Similarly, let us identify the right-hand side $f$ with the vector $\mathbf{f} = (\hat F_\ell) = (c_\ell^{-1} \hat f_\ell) \in \mathbb{C}^{\mathbb{Z}^d}$ of
its $H^{-1}_p$-normalized Fourier coefficients. Finally, let us introduce the bi-infinite, Hermitian and
positive-definite matrix

$$\mathbf{A} = (a_{\ell,k}) \quad \text{with} \quad a_{\ell,k} = \frac{1}{c_\ell c_k}\, a(\phi_k, \phi_\ell). \tag{2.10}$$

Then, Problem (2.4) can be equivalently written as

$$\mathbf{A} \mathbf{u} = \mathbf{f}. \tag{2.11}$$

We observe that the orthogonality properties of the trigonometric basis imply that the matrix
$\mathbf{A}$ is diagonal if and only if the coefficients $\nu$ and $\sigma$ are constant in $\Omega$.

Next, consider the Galerkin problem (2.6) and let $\mathbf{u}_\Lambda \in \mathbb{C}^{|\Lambda|}$ be the vector collecting the
coefficients of $u_\Lambda$ indexed in $\Lambda$; let $\mathbf{f}_\Lambda \in \mathbb{C}^{|\Lambda|}$ be the analogous restriction for the vector of the
coefficients of $f$. Finally, denote by $\mathbf{R}_\Lambda$ the matrix that restricts a bi-infinite vector to the
portion indexed in $\Lambda$, so that $\mathbf{E}_\Lambda = \mathbf{R}_\Lambda^H$ is the corresponding extension matrix. Then, setting

$$\mathbf{A}_\Lambda = \mathbf{R}_\Lambda \mathbf{A} \mathbf{R}_\Lambda^H, \tag{2.12}$$

Problem (2.6) can be equivalently written as

$$\mathbf{A}_\Lambda \mathbf{u}_\Lambda = \mathbf{f}_\Lambda. \tag{2.13}$$
2.4 Properties of the stiffness matrix

It is useful to express the elements of $\mathbf{A}$ in terms of the Fourier coefficients of the operator
coefficients $\nu$ and $\sigma$. Precisely, writing $\nu = \sum_k \hat\nu_k \phi_k$ and $\sigma = \sum_k \hat\sigma_k \phi_k$ and using the orthogonality
of the Fourier basis, one easily gets

$$a_{\ell,k} = \frac{1}{(2\pi)^{d/2}} \Big( \frac{\ell \cdot k}{c_\ell c_k}\, \hat\nu_{\ell-k} + \frac{1}{c_\ell c_k}\, \hat\sigma_{\ell-k} \Big). \tag{2.14}$$

Note that the diagonal elements are uniformly bounded from below,

$$a_{\ell,\ell} \ge \frac{1}{(2\pi)^{d/2}} \min(\hat\nu_0, \hat\sigma_0) > 0, \qquad \ell \in \mathbb{Z}^d, \tag{2.15}$$

whereas all elements are bounded in modulus by the elements of a Toeplitz matrix,

$$|a_{\ell,k}| \le \frac{1}{(2\pi)^{d/2}} \big( |\hat\nu_{\ell-k}| + |\hat\sigma_{\ell-k}| \big), \qquad \ell, k \in \mathbb{Z}^d, \tag{2.16}$$

which decay as $|\ell - k| \to \infty$ at a rate dictated by the smoothness of the operator coefficients.
Indeed, if $\nu$ and $\sigma$ are sufficiently smooth, their Fourier coefficients decay at a suitable rate and
this property is inherited by the off-diagonal elements of the matrix $\mathbf{A}$, via (2.16). To be precise,
if the coefficients $\nu$ and $\sigma$ have a finite order of regularity, then the rate of decay of their Fourier
coefficients is algebraic, i.e.

$$|\hat\nu_k|, \ |\hat\sigma_k| \lesssim (1 + |k|)^{-\eta} \qquad \forall k \in \mathbb{Z}^d, \tag{2.17}$$

for some $\eta > 0$. On the other hand, if the operator coefficients are real analytic in a neighborhood
of $\Omega$, then the rate of decay of their Fourier coefficients is exponential, i.e.

$$|\hat\nu_k|, \ |\hat\sigma_k| \lesssim e^{-\eta |k|} \qquad \forall k \in \mathbb{Z}^d. \tag{2.18}$$
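Formula (2.14) can be made concrete in dimension $d = 1$. In the minimal sketch below, an assumption is made on the data layout: the coefficients are given as dictionaries mapping $m$ to the coefficient of $e^{imx}$ in $\nu$ and $\sigma$, which absorbs the factor $(2\pi)^{-d/2}$; the function name `stiffness_entry` is hypothetical.

```python
import math

def stiffness_entry(l, k, nu, sigma):
    """Entry a_{l,k} of the normalized stiffness matrix for d = 1.

    nu, sigma: dicts {m: coefficient of exp(i*m*x)} (assumed convention).
    """
    c_l = math.sqrt(1 + l * l)
    c_k = math.sqrt(1 + k * k)
    return (l * k * nu.get(l - k, 0.0) + sigma.get(l - k, 0.0)) / (c_l * c_k)

# Constant coefficients give a diagonal matrix.
const_nu, const_sigma = {0: 2.0}, {0: 3.0}
# nu(x) = 1 + 0.5*cos(x), sigma = 1 give a tridiagonal matrix.
var_nu, var_sigma = {0: 1.0, 1: 0.25, -1: 0.25}, {0: 1.0}
```

With constant coefficients every off-diagonal entry vanishes, while a coefficient with a single cosine mode couples each frequency only to its two neighbours, which illustrates how the bandwidth of the matrix reflects the Fourier content of the coefficients.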

Correspondingly, the matrix A belongs to one of the following classes. 

Definition 2.1 (regularity classes for A) A matrix $\mathbf{A}$ is said to belong to

• the algebraic class $\mathcal{D}_a(\eta_L)$ if there exists a constant $c_L > 0$ such that its elements satisfy

$$|a_{\ell,k}| \le c_L (1 + |\ell - k|)^{-\eta_L}, \qquad \ell, k \in \mathbb{Z}^d; \tag{2.19}$$

• the exponential class $\mathcal{D}_e(\eta_L)$ if there exists a constant $c_L > 0$ such that its elements satisfy

$$|a_{\ell,k}| \le c_L\, e^{-\eta_L |\ell - k|}, \qquad \ell, k \in \mathbb{Z}^d. \tag{2.20}$$

The following properties hold.

Property 2.1 (continuity of A) If either $\mathbf{A} \in \mathcal{D}_a(\eta_L)$, with $\eta_L > d$, or $\mathbf{A} \in \mathcal{D}_e(\eta_L)$, then $\mathbf{A}$
defines a bounded operator on $\ell^2(\mathbb{Z}^d)$.

Proof. See e.g. P2HSJ. 






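Property 2.2 (inverse of A: algebraic case) If $\mathbf{A} \in \mathcal{D}_a(\eta_L)$, with $\eta_L > d$, and $\mathbf{A}$ is invertible
in $\ell^2(\mathbb{Z}^d)$, then $\mathbf{A}^{-1} \in \mathcal{D}_a(\eta_L)$.

Proof. See e.g. [32]. □

Property 2.3 (inverse of A: exponential case) If $\mathbf{A} \in \mathcal{D}_e(\eta_L)$ and there exists a constant
$c_L$ satisfying (2.20) such that

$$c_L < \frac{1}{2} \big( e^{\eta_L} - 1 \big) \min_\ell a_{\ell,\ell}, \tag{2.21}$$

then $\mathbf{A}$ is invertible in $\ell^2(\mathbb{Z}^d)$ and $\mathbf{A}^{-1} \in \mathcal{D}_e(\bar\eta_L)$, where $\bar\eta_L \in (0, \eta_L]$ is such that $\bar z = e^{-\bar\eta_L}$ is the
unique zero in the interval $(0, 1)$ of the polynomial

$$z^2 - \frac{e^{2\eta_L} + 2 c_L + 1}{e^{\eta_L} (c_L + 1)}\, z + 1.$$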

Proof. We follow the suggestion by Bini [3], and thus exploit the one-to-one correspondence
between Toeplitz matrices and formal Laurent series (see e.g. [3]):

$$f(z) = \sum_{k=-\infty}^{\infty} a_k z^k \quad \longleftrightarrow \quad T_f = (t_{i,j}), \quad t_{i,j} = a_{j-i}.$$

We refer to the function $f(z)$ as the symbol associated to the Toeplitz matrix $T_f$. We recall
now a few relations between $f(z)$ and $T_f$. If $f(z)$ is analytic on $A_\alpha = \{z \in \mathbb{C} : e^{-\alpha} < |z| < e^{\alpha}\}$
with $\alpha > 0$, then there holds $f(z) = \sum_{k=-\infty}^{\infty} a_k z^k$, where the coefficients $a_k$ have exponential
decay with rate $e^{-\alpha}$ in the sense that for every $\rho$ with $e^{-\alpha} < \rho < 1$ there exists a constant $\gamma > 0$ such
that $|a_k| \le \gamma \rho^{|k|}$. As a consequence, the symbol $f(z)$ of the Toeplitz matrix $T_f$ is analytic on $A_\alpha$
for some $\alpha > 0$ if and only if the elements of $T_f$ decay exponentially with rate $e^{-\alpha}$. Moreover,
it is known that if $f(z)$ is analytic on $A_\alpha$ and it is non-zero on $A_\beta \subset A_\alpha$, then the function
$g(z) = 1/f(z)$ is well defined and analytic on $A_\beta$, the matrix $T_g$ is the inverse of $T_f$ and the
elements of $T_g$ decay exponentially with rate $e^{-\beta}$.
We next introduce the analytic functions in $A_\alpha$

$$h(z) = \sum_{k=1}^{\infty} e^{-\alpha k} \big( z^k + z^{-k} \big) = \frac{z}{e^{\alpha} - z} + \frac{1}{e^{\alpha} z - 1}, \qquad f_c(z) = 1 - c\, h(z),$$

with $c > 0$. For $|z| = 1$ we deduce $|h(z)| \le 2 \sum_{k=1}^{\infty} e^{-\alpha k} = 2/(e^{\alpha} - 1)$, whence $c |h(z)| < 1$
provided that $c < \frac{1}{2}(e^{\alpha} - 1)$; moreover $\|T_h\| \le \|T_h\|_\infty = 2/(e^{\alpha} - 1)$, which is indeed a
particular instance of the Schur Lemma for symmetric matrices. For this range of $c$'s, $f_c(z) \ne 0$ for
$|z| = 1$ and by continuity there exists $A_\beta \subset A_\alpha$ on which $f_c(z)$ is non-zero. This implies that
$g_c(z) := 1/f_c(z)$ is analytic on $A_\beta$ and the elements of the associated Toeplitz matrix $T_{g_c}$ decay
exponentially with rate $e^{-\beta}$. The singularities of $g_c$ correspond to zeros of $f_c$, which are in turn
the roots $\zeta_1, \zeta_2$ of the polynomial

$$z^2 - \frac{e^{2\alpha} + 2c + 1}{e^{\alpha}(c + 1)}\, z + 1.$$

These roots are real provided $c < \frac{1}{2}(e^{\alpha} - 1)$, in which case $e^{-\beta} = \zeta_1 = \zeta_2^{-1} < 1$.
Let $\mathbf{A} \in \mathcal{D}_e(\alpha)$, i.e. there exists a constant $c$ such that $|a_{\ell,k}| \le c\, e^{-\alpha |\ell - k|}$ for $\ell, k \in \mathbb{Z}^d$.
By rescaling of the rows of $\mathbf{A}$, it is not restrictive to assume that the diagonal elements of $\mathbf{A}$ are
equal to 1. Then, it is possible to write $\mathbf{A} = \mathbf{I} - \mathbf{S}$ with $|\mathbf{S}| \le c\, T_h$, the inequality being meant
element by element, and $\|\mathbf{S}\| < 1$. Since $g_c(z) = 1/(1 - c\, h(z)) = \sum_{k=0}^{\infty} c^k h(z)^k$ is well defined
and analytic on $A_\beta \subset A_\alpha$, it follows that

$$\Big| \sum_{k=0}^{\infty} \mathbf{S}^k \Big| \le \sum_{k=0}^{\infty} |\mathbf{S}|^k \le \sum_{k=0}^{\infty} c^k T_h^k = T_{g_c}.$$

Hence, the elements of the matrix $T_{g_c}$ decay exponentially with rate $e^{-\beta}$. Property $\|\mathbf{S}\| < 1$
yields $\mathbf{A}^{-1} = (\mathbf{I} - \mathbf{S})^{-1} = \sum_{k=0}^{\infty} \mathbf{S}^k$ and $|\mathbf{A}^{-1}| \le T_{g_c}$, whence the coefficients of $\mathbf{A}^{-1}$, being
bounded by those of $T_{g_c}$, decay exponentially with rate $e^{-\beta}$, i.e. $\mathbf{A}^{-1} \in \mathcal{D}_e(\beta)$ for some $\beta < \alpha$.
This gives (2.21) once the row scaling of $\mathbf{A}$ is taken into account. □

Example 2.1 (sharpness of (2.21)) The following example illustrates that (2.21) is sharp.
Let $\mathbf{A}$ be

$$a_{ij} = -\frac{1}{2}\, 2^{-|i-j|} \ \text{ for } i \ne j, \qquad a_{jj} = 1,$$

which is singular because the sum of the coefficients in every row vanishes. This $\mathbf{A}$ corresponds
to $e^{\eta_L} = 2$, $c_L = \frac{1}{2}$ and $\frac{1}{2}(e^{\eta_L} - 1) = \frac{1}{2}$, which violates (2.21).
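The borderline nature of this example can be checked with a short computation. The sketch below is illustrative and rests on two assumptions: the off-diagonal entries are taken as $-\frac{1}{2} 2^{-|i-j|}$ (the form forced by the stated row-sum condition), and the hypothetical function `inverse_decay_rate` evaluates the smaller root of the quadratic polynomial of Property 2.3 with unit diagonal.

```python
import math

def inverse_decay_rate(eta, c):
    """Return eta_bar such that exp(-eta_bar) is the root in (0, 1] of z^2 - b*z + 1."""
    b = (math.exp(2 * eta) + 2 * c + 1) / (math.exp(eta) * (c + 1))
    disc = b * b - 4
    if disc < -1e-12:
        raise ValueError("complex roots: c exceeds (exp(eta) - 1) / 2")
    z = (b - math.sqrt(max(disc, 0.0))) / 2
    return -math.log(z)

def truncated_row_sum(n):
    """Row sum of the example matrix using the diagonal and n entries on each side."""
    return 1.0 - 2 * sum(0.5 * 2.0 ** (-k) for k in range(1, n + 1))

rate_borderline = inverse_decay_rate(math.log(2), 0.5)  # close to 0: no decay left
rate_no_coupling = inverse_decay_rate(1.0, 0.0)         # equals eta when c = 0
rate_admissible = inverse_decay_rate(1.0, 0.5)          # strictly between 0 and eta
```

At the borderline value $c_L = \frac{1}{2}(e^{\eta_L} - 1)$ the two roots merge at $z = 1$, so the predicted decay rate of the inverse degenerates to zero, consistent with the matrix being singular.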

For any integer $J \ge 0$, let $\mathbf{A}_J$ denote the following symmetric truncation of the matrix $\mathbf{A}$:

$$(\mathbf{A}_J)_{\ell,k} = \begin{cases} a_{\ell,k} & \text{if } |\ell - k| \le J, \\ 0 & \text{elsewhere}. \end{cases} \tag{2.22}$$

Then, we have the following well-known results, whose proof is reported for completeness. 

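Property 2.4 (truncation) The truncated matrix $\mathbf{A}_J$ has a number of non-vanishing entries
bounded by $\omega_d J^d$, where $\omega_d$ is the measure of the Euclidean unit ball in $\mathbb{R}^d$. Moreover, under the
assumption of Property 2.1, there exists a constant $C_{\mathbf{A}}$ such that

$$\|\mathbf{A} - \mathbf{A}_J\| \le \psi_{\mathbf{A}}(J, \eta_L) := C_{\mathbf{A}} \begin{cases} (J + 1)^{-(\eta_L - d)} & \text{if } \mathbf{A} \in \mathcal{D}_a(\eta_L) \ \text{(algebraic case)}, \\ (J + 1)^{d-1} e^{-\eta_L J} & \text{if } \mathbf{A} \in \mathcal{D}_e(\eta_L) \ \text{(exponential case)}, \end{cases}$$

for all $J \ge 0$. Consequently, under the assumptions of Property 2.2 or 2.3, one has

$$\|\mathbf{A}^{-1} - (\mathbf{A}^{-1})_J\| \le \psi_{\mathbf{A}^{-1}}(J, \bar\eta_L), \tag{2.23}$$

where we let $\bar\eta_L = \eta_L$ in the algebraic case and $\bar\eta_L$ be defined in Property 2.3 for the exponential
case.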



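Proof. We use the Schur Lemma for symmetric matrices, $\|\mathbf{B}\| \le \|\mathbf{B}\|_\infty = \sup_\ell \sum_k |b_{\ell,k}|$, for
$\mathbf{B} = \mathbf{A} - \mathbf{A}_J$. Thus, in the algebraic case, grouping the indices $k$ in shells of radius $q$ around $\ell$,

$$\|\mathbf{A} - \mathbf{A}_J\| \le \sup_\ell \sum_{k : |\ell - k| > J} |a_{\ell,k}| \le c_L \sup_\ell \sum_{k : |\ell - k| > J} \frac{1}{(1 + |\ell - k|)^{\eta_L}} \lesssim \sum_{q = J+1}^{\infty} q^{d-1} q^{-\eta_L} \lesssim (J + 1)^{d - \eta_L}.$$

A similar argument yields the result in the exponential case.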
2.5 An equivalent formulation of the Galerkin problem

For future reference, hereafter we rewrite the Galerkin problem (2.13) in an equivalent
(infinite-dimensional) way. Let

$$\mathbf{P}_\Lambda : \ell^2(\mathbb{Z}^d) \to \ell^2(\mathbb{Z}^d)$$

be the projector operator defined as

$$(\mathbf{P}_\Lambda \mathbf{v})_\lambda = \begin{cases} v_\lambda & \text{if } \lambda \in \Lambda, \\ 0 & \text{if } \lambda \notin \Lambda. \end{cases}$$

Note that $\mathbf{P}_\Lambda$ can be represented as a diagonal bi-infinite matrix whose diagonal elements are
1 for indexes belonging to $\Lambda$, zero otherwise. Let us set $\mathbf{Q}_\Lambda = \mathbf{I} - \mathbf{P}_\Lambda$ and introduce the
bi-infinite matrix $\widehat{\mathbf{A}}_\Lambda := \mathbf{P}_\Lambda \mathbf{A} \mathbf{P}_\Lambda + \mathbf{Q}_\Lambda$, which is equal to $\mathbf{A}_\Lambda$ for indexes in $\Lambda$ and to the identity
matrix otherwise. The definitions of the projectors $\mathbf{P}_\Lambda$ and $\mathbf{Q}_\Lambda$ yield the following result.

Property 2.5 (invertibility of $\widehat{\mathbf{A}}_\Lambda$) If $\mathbf{A}$ is invertible with either $\mathbf{A} \in \mathcal{D}_a(\eta_L)$ or $\mathbf{A} \in \mathcal{D}_e(\eta_L)$,
then the same holds for $\widehat{\mathbf{A}}_\Lambda$.

Now, let us consider the following extended Galerkin problem: find $\hat{\mathbf{u}} \in \ell^2(\mathbb{Z}^d)$ such that

$$\widehat{\mathbf{A}}_\Lambda \hat{\mathbf{u}} = \mathbf{P}_\Lambda \mathbf{f}. \tag{2.24}$$

Let $\mathbf{E}_\Lambda : \mathbb{C}^{|\Lambda|} \to \ell^2(\mathbb{Z}^d)$ be the extension operator defined in Sect. 2.3 and let $\mathbf{u}_\Lambda \in \mathbb{C}^{|\Lambda|}$ be the
Galerkin solution to (2.13); then, it is easy to check that $\hat{\mathbf{u}} = \mathbf{E}_\Lambda \mathbf{u}_\Lambda$.

In the following, with an abuse of notation, the solution of (2.24) will be denoted by $\mathbf{u}_\Lambda$.
We will refer to it as the (extended) Galerkin solution, meaning the infinite-dimensional
representative of the finite-dimensional Galerkin solution. In case of possible confusion, we will
make clear which version (infinite-dimensional or finite-dimensional) has to be considered.



3 Adaptive algorithms with contraction properties 

Our first algorithm will be an ideal one; it will serve as a reference to illustrate in the simplest
situation the contraction property which guarantees the convergence of the algorithm, and it
will be subsequently modified to get more efficient versions. The ideal algorithm uses as error
estimator the ideal one, i.e., the norm of the residual in $H^{-1}_p(\Omega)$; we thus set, for any $v \in H^1_p(\Omega)$,

$$\eta^2(v) = \|r(v)\|^2 = \sum_{k \in \mathbb{Z}^d} |\hat R_k(v)|^2, \tag{3.1}$$

so that (2.8) can be rephrased as

$$\frac{1}{\alpha^*}\, \eta(u_\Lambda) \le \|u - u_\Lambda\| \le \frac{1}{\alpha_*}\, \eta(u_\Lambda); \tag{3.2}$$

recall that $\hat R_k(v) = (1 + |k|^2)^{-1/2}\, \hat r_k(v)$ according to (2.2). Obviously, this estimator is hardly
computable in practice; in Sect. 3.2 we will introduce a feasible version, but for the moment we
go through the ideal situation. Given any subset $\Lambda \subseteq \mathbb{Z}^d$, we also define the quantity

$$\eta^2(v; \Lambda) = \|P_\Lambda r(v)\|^2 = \sum_{k \in \Lambda} |\hat R_k(v)|^2,$$

so that $\eta(v) = \eta(v; \mathbb{Z}^d)$.




3.1 ADFOUR: an ideal algorithm 

We now introduce the following procedures, which will enter the definition of all our adaptive
algorithms.

• $u_\Lambda$ := GAL($\Lambda$)

Given a finite subset $\Lambda \subset \mathbb{Z}^d$, the output $u_\Lambda \in V_\Lambda$ is the solution of the Galerkin problem
(2.6) relative to $\Lambda$.

• $r$ := RES($v_\Lambda$)

Given a function $v_\Lambda \in V_\Lambda$ for some finite index set $\Lambda$, the output $r$ is the residual
$r(v_\Lambda) = f - L v_\Lambda$.

• $\Lambda^*$ := DÖRFLER($r$, $\theta$)

Given $\theta \in (0, 1)$ and an element $r \in H^{-1}_p(\Omega)$, the output $\Lambda^* \subset \mathbb{Z}^d$ is a finite set such that
the inequality

$$\|P_{\Lambda^*} r\| \ge \theta \|r\| \tag{3.3}$$

is satisfied.

Note that the latter inequality is equivalent to

$$\|r - P_{\Lambda^*} r\| \le \sqrt{1 - \theta^2}\, \|r\|. \tag{3.4}$$

If $r = r(u_\Lambda)$ is the residual of a Galerkin solution $u_\Lambda \in V_\Lambda$, then by (2.7) we can trivially assume
that $\Lambda^*$ is contained in $\Lambda^c := \mathbb{Z}^d \setminus \Lambda$. For such a residual, inequality (3.3) can then be stated as

$$\eta(u_\Lambda; \Lambda^*) \ge \theta\, \eta(u_\Lambda), \tag{3.5}$$

a condition termed Dörfler marking in the finite element literature, or bulk chasing in the wavelet
literature. Writing $\hat R_k = \hat R_k(u_\Lambda)$, the condition (3.5) can be equivalently stated as

$$\sum_{k \in \Lambda^*} |\hat R_k|^2 \ge \theta^2 \sum_{k \notin \Lambda} |\hat R_k|^2. \tag{3.6}$$

Also note that a set $\Lambda^*$ of minimal cardinality can be immediately determined if the coefficients
$\hat R_k$ are rearranged in non-increasing order of modulus; however, the subsequent convergence
result does not require the property of minimal cardinality for the sets of active coefficients.
In the sequel, we will invariably make the following assumption:

Assumption 3.1 (Dörfler marking) The procedure DÖRFLER selects an index set $\Lambda^*$ of
minimal cardinality among all those satisfying condition (3.3).

Given two parameters $\theta \in (0, 1)$ and $tol \in [0, 1)$, we are ready to define our ideal adaptive
algorithm.

Algorithm ADFOUR($\theta$, $tol$)

Set $r_0 := f$, $\Lambda_0 := \emptyset$, $n = -1$

do

    $n \leftarrow n + 1$
    $\partial\Lambda_n$ := DÖRFLER($r_n$, $\theta$)
    $\Lambda_{n+1} := \Lambda_n \cup \partial\Lambda_n$
    $u_{n+1}$ := GAL($\Lambda_{n+1}$)
    $r_{n+1}$ := RES($u_{n+1}$)

while $\|r_{n+1}\| > tol$
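The loop above can be prototyped in one dimension. The following Python sketch is not the method as analyzed in the paper but a toy version under explicit assumptions: the infinite index set is replaced by the finite window $|k| \le K$, all data are real, the coefficients of $\nu$ and $\sigma$ are given as dictionaries of exponential-mode coefficients, a plain `while` loop replaces the do-while, and every function name is hypothetical.

```python
import math

def entry(l, k, nu, sigma):
    """Normalized stiffness entry a_{l,k} for d = 1 (assumed coefficient layout)."""
    c = lambda j: math.sqrt(1 + j * j)
    return (l * k * nu.get(l - k, 0.0) + sigma.get(l - k, 0.0)) / (c(l) * c(k))

def solve(M, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(M)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            fac = M[r][col] / M[col][col]
            for j in range(col, n + 1):
                M[r][j] -= fac * M[col][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][j] * x[j] for j in range(r + 1, n))) / M[r][r]
    return x

def gal(Lam, F, nu, sigma):
    """Galerkin solve A_Lam u_Lam = f_Lam on the index set Lam."""
    idx = sorted(Lam)
    if not idx:
        return {}
    A = [[entry(l, k, nu, sigma) for k in idx] for l in idx]
    return dict(zip(idx, solve(A, [F.get(l, 0.0) for l in idx])))

def res(u, F, nu, sigma, K):
    """Normalized residual f - A u restricted to the window |l| <= K."""
    return {l: F.get(l, 0.0) - sum(entry(l, k, nu, sigma) * uk for k, uk in u.items())
            for l in range(-K, K + 1)}

def dorfler(r, theta):
    """Minimal set capturing a fraction theta of the residual norm."""
    total = sum(x * x for x in r.values())
    marked, got = set(), 0.0
    for k in sorted(r, key=lambda k: abs(r[k]), reverse=True):
        if got >= theta ** 2 * total:
            break
        marked.add(k)
        got += r[k] ** 2
    return marked

def adfour(F, nu, sigma, theta, tol, K):
    """Toy ADFOUR loop; requires tol > 0 so that rounding cannot stall it."""
    if tol <= 0:
        raise ValueError("tol must be positive in this finite-precision sketch")
    Lam, u = set(), {}
    r = res(u, F, nu, sigma, K)
    history = [math.sqrt(sum(x * x for x in r.values()))]
    while history[-1] > tol:
        Lam |= dorfler(r, theta)
        u = gal(Lam, F, nu, sigma)
        r = res(u, F, nu, sigma, K)
        history.append(math.sqrt(sum(x * x for x in r.values())))
    return Lam, u, history
```

A possible call uses $\nu(x) = 1 + 0.5\cos x$, $\sigma = 1$ and exponentially decaying data, e.g. `adfour({k: math.exp(-abs(k)) for k in range(-12, 13)}, {0: 1.0, 1: 0.25, -1: 0.25}, {0: 1.0}, 0.5, 1e-6, 12)`; the returned history lists the residual norms after each Galerkin solve.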

The following result states the convergence of this algorithm, with a guaranteed error reduc- 
tion rate. 

Theorem 3.1 (convergence of ADFOUR) Let us set

$$\rho = \rho(\theta) = \sqrt{1 - \frac{\alpha_*}{\alpha^*}\, \theta^2} \in (0, 1). \tag{3.7}$$

Let $\{\Lambda_n, u_n\}_{n \ge 0}$ be the sequence generated by the adaptive algorithm ADFOUR. Then, the
following bound holds for any $n$:

$$|||u - u_{n+1}||| \le \rho\, |||u - u_n|||.$$

Thus, for any $tol > 0$ the algorithm terminates in a finite number of iterations, whereas for
$tol = 0$ the sequence $u_n$ converges to $u$ in $H^1_p(\Omega)$ as $n \to \infty$.

Proof. For convenience, we use the notation $e_n := |||u - u_n|||$ and $d_n := |||u_{n+1} - u_n|||$. As
$V_{\Lambda_n} \subset V_{\Lambda_{n+1}}$, the following orthogonality property holds

$$e_{n+1}^2 = e_n^2 - d_n^2 . \qquad (3.8)$$

On the other hand, for any $w \in H^1_p(\Omega)$, one has in light of (2.5)

$$\|Lw\| = \sup_{v \in H^1_p(\Omega)} \frac{\langle Lw, v \rangle}{\|v\|} = \sup_{v \in H^1_p(\Omega)} \frac{a(w,v)}{\|v\|} \le |||w||| \sup_{v \in H^1_p(\Omega)} \frac{|||v|||}{\|v\|} \le \sqrt{\alpha^*}\, |||w||| .$$

Thus, using (3.3),

$$d_n^2 \ge \frac{1}{\alpha^*} \|L(u_{n+1} - u_n)\|^2 = \frac{1}{\alpha^*} \|r_{n+1} - r_n\|^2 \ge \frac{1}{\alpha^*} \|P_{\Lambda_{n+1}}(r_{n+1} - r_n)\|^2 = \frac{1}{\alpha^*} \|P_{\Lambda_{n+1}} r_n\|^2 \ge \frac{\theta^2}{\alpha^*} \|r_n\|^2 .$$

On the other hand, the rightmost inequality in (2.9) states that $\|r_n\|^2 \ge \alpha_* e_n^2$, whence the result.
□



3.2 F-ADFOUR: A feasible version of ADFOUR

The error estimator $\eta(u_\Lambda)$ based on (3.1) is not computable in practice, since the residual
$r(u_\Lambda)$ contains infinitely many coefficients. We thus introduce a new estimator, defined from an
approximation of such residual with finite Fourier expansion (i.e., a trigonometric polynomial).
To this end, let $\tilde\nu$, $\tilde\sigma$ and $\tilde f$ be suitable trigonometric polynomials, which approximate $\nu$, $\sigma$ and
$f$, respectively, to a given accuracy. Then, the quantity

$$\tilde r(u_\Lambda) = \tilde f - \tilde L u_\Lambda = \tilde f + \nabla \cdot (\tilde\nu \nabla u_\Lambda) - \tilde\sigma u_\Lambda \qquad (3.9)$$

belongs to $V_{\tilde\Lambda}$ for some finite subset $\tilde\Lambda \subset \mathbb{Z}^d$, i.e., it has the finite (thus, computable) expansion

$$\tilde r(u_\Lambda) = \sum_{k \in \tilde\Lambda} \hat{\tilde r}_k(u_\Lambda)\, \phi_k .$$

The choice of the approximate coefficients has to be done in order to fulfil the following condition:
for a fixed parameter $\gamma \in (0, \theta)$, we require that

$$\|r(u_\Lambda) - \tilde r(u_\Lambda)\| \le \gamma\, \|\tilde r(u_\Lambda)\| . \qquad (3.10)$$

Satisfying such a condition is possible, provided we have full access to the data. Indeed, on the
one hand, the left-hand side tends to 0 as the approximation of the coefficients gets better and
better, since (we keep here the full norm indication for better clarity)

$$\|r(u_\Lambda) - \tilde r(u_\Lambda)\|_{H^{-1}_p(\Omega)} \le \|f - \tilde f\|_{H^{-1}_p(\Omega)} + \|\nu - \tilde\nu\|_{L^\infty(\Omega)} \|\nabla u_\Lambda\|_{L^2(\Omega)^d} + \|\sigma - \tilde\sigma\|_{L^\infty(\Omega)} \|u_\Lambda\|_{L^2(\Omega)}$$

$$\le \|f - \tilde f\|_{H^{-1}_p(\Omega)} + \big( \|\nu - \tilde\nu\|_{L^\infty(\Omega)} + \|\sigma - \tilde\sigma\|_{L^\infty(\Omega)} \big) \frac{1}{\alpha_*} \|f\|_{H^{-1}_p(\Omega)} ,$$

where we have used the bound on the solution of the Galerkin problem (2.6) in terms of the
data. On the other hand, if $u_\Lambda \ne u$, then $r(u_\Lambda) \ne 0$, whence the right-hand side of (3.10)
converges to a non-zero value as $\tilde\Lambda$ increases.

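With this remark in mind, we define a new error estimator by setting

$$\tilde\eta^2(u_\Lambda) := \|\tilde r(u_\Lambda)\|^2 = \sum_{k \in \tilde\Lambda} |\hat{\tilde R}_k(u_\Lambda)|^2 , \qquad (3.11)$$

which, in view of (3.10), immediately yields

$$\frac{1 - \gamma}{\alpha^*}\, \tilde\eta(u_\Lambda) \le \|u - u_\Lambda\| \le \frac{1 + \gamma}{\alpha_*}\, \tilde\eta(u_\Lambda) . \qquad (3.12)$$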

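Lemma 3.1 (feasible Dorfler marking) Let $\Lambda^*$ be any finite index set such that

$$\tilde\eta(u_\Lambda; \Lambda^*) \ge \theta\, \tilde\eta(u_\Lambda) .$$

Then,

$$\eta(u_\Lambda; \Lambda^*) \ge \tilde\theta\, \eta(u_\Lambda) , \quad \text{with } \tilde\theta = \frac{\theta - \gamma}{1 + \gamma} \in (0, \theta) . \qquad (3.13)$$

Proof. One has

$$\|P_{\Lambda^*} r(u_\Lambda)\| \ge \|P_{\Lambda^*} \tilde r(u_\Lambda)\| - \|P_{\Lambda^*}(r(u_\Lambda) - \tilde r(u_\Lambda))\|$$

$$\ge \theta\, \|\tilde r(u_\Lambda)\| - \|r(u_\Lambda) - \tilde r(u_\Lambda)\|$$

$$\ge (\theta - \gamma)\, \|\tilde r(u_\Lambda)\| \ge \frac{\theta - \gamma}{1 + \gamma}\, \|r(u_\Lambda)\| ,$$

which is the desired (3.13). □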
The previous result suggests introducing the following feasible variant of the procedure RES:

• $\tilde r$ := F-RES($v_\Lambda$, $\gamma$)

Given $\gamma \in (0, \theta)$ and a function $v_\Lambda \in V_\Lambda$ for some finite index set $\Lambda$, the output $\tilde r$ is an
approximate residual $\tilde r(v_\Lambda) = \tilde f + \nabla \cdot (\tilde\nu \nabla v_\Lambda) - \tilde\sigma v_\Lambda$, defined on a finite set $\tilde\Lambda$ and satisfying

$$\|r(v_\Lambda) - \tilde r(v_\Lambda)\| \le \gamma\, \|\tilde r(v_\Lambda)\| .$$

Theorem 3.2 (contraction property of F-ADFOUR) Consider the feasible variant F-ADFOUR
of the adaptive algorithm ADFOUR, where the step $r_{n+1}$ := RES($u_{n+1}$) is replaced by the step
$\tilde r_{n+1}$ := F-RES($u_{n+1}$, $\gamma$) for some $\gamma \in (0, \theta)$. Then, the same conclusions of Theorem 3.1 hold
true for this variant, with the contraction factor $\rho$ replaced by $\tilde\rho = \rho(\tilde\theta)$, where $\tilde\theta$ is defined in
(3.13). □
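
To make the effect of the inexact residual concrete, the following sketch evaluates the reduced marking parameter of (3.13) and the contraction factor of (3.7). The function names and the arguments `alpha_lo`, `alpha_hi` (standing for $\alpha_*$ and $\alpha^*$) are hypothetical choices for this illustration.

```python
import math

def feasible_theta(theta, gamma):
    """Reduced Dorfler parameter (theta - gamma) / (1 + gamma) from (3.13)."""
    if not 0.0 < gamma < theta < 1.0:
        raise ValueError("need 0 < gamma < theta < 1")
    return (theta - gamma) / (1.0 + gamma)

def contraction_factor(theta, alpha_lo, alpha_hi):
    """rho(theta) = sqrt(1 - (alpha_* / alpha^*) * theta^2) from (3.7)."""
    return math.sqrt(1.0 - alpha_lo / alpha_hi * theta ** 2)
```

Since the reduced parameter is smaller than $\theta$, the guaranteed contraction factor of the feasible variant is closer to 1 than that of the ideal algorithm.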



In the rest of the paper, we will develop our analysis considering Algorithm ADFOUR 
rather than F-ADFOUR; this is just for the sake of simplicity, since all the conclusions extend 
in a straightforward manner to the latter version as well. 

3.3 A-ADFOUR: An aggressive version of ADFOUR 

Theorem 3.1 indicates that even if one chooses $\theta$ very close to 1, the predicted error reduction
rate $\rho = \rho(\theta)$ is always bounded from below by the quantity $\sqrt{1 - \alpha_*/\alpha^*}$. Such a result looks
overly pessimistic, particularly in the case of smooth (analytic) solutions, since a Fourier method
allows for an exponential decay of the error as the number of (properly selected) active degrees
of freedom is increased. Fig. 1 displays the influence of the Dorfler parameter $\theta$ on the decay rate
and number of solves: choosing $\theta$ closer to 1 does not significantly affect the rate of decay of the
error versus the number of activated degrees of freedom, but it significantly reduces the number
of iterations. This in turn reduces the computational cost measured in terms of Galerkin solves.

Motivated by this observation, hereafter we consider a variant of Algorithm ADFOUR,
which, under the assumptions of Property 2.2 or 2.3, guarantees an arbitrarily large error
reduction per iteration, provided the set of the new degrees of freedom detected by DORFLER
is suitably enriched.

At the $n$-th iteration, let us define the set $\Lambda_{n+1} := \Lambda_n \cup \partial\Lambda_n$ by setting

$$\widetilde{\partial\Lambda}_n := \text{DORFLER}(r_n, \theta) , \qquad \partial\Lambda_n := \text{ENRICH}(\widetilde{\partial\Lambda}_n, J) , \qquad (3.14)$$

where the latter procedure and the value of the integer $J$ will be defined later on. We recall
that the set $\widetilde{\partial\Lambda}_n$ is such that $g_n = P_{\widetilde{\partial\Lambda}_n} r_n$ satisfies

$$\|r_n - g_n\| \le \sqrt{1 - \theta^2}\, \|r_n\|$$

(see (3.4)). Let $w_n \in V$ be the solution of $L w_n = g_n$, which in general will have infinitely many
components, and let us split it as

$$w_n = P_{\Lambda_{n+1}} w_n + P_{\Lambda^c_{n+1}} w_n =: y_n + z_n \in V_{\Lambda_{n+1}} \oplus V_{\Lambda^c_{n+1}} .$$

Figure 1: Residual norm vs number of degrees of freedom activated by ADFOUR, for different
choices of Dorfler parameter $\theta$: solid line: $\theta = 1 - 10^{-1}$; dash-dotted line: $\theta = 1 - 10^{-2}$; dashed
line: $\theta = 1 - 10^{-3}$. The symbols (circles, diamonds, stars) identify the various ADFOUR
iterations for the sample 1D problem (2.3) with analytic solution $u(x) = \exp(\cos 2x + \sin x)$ and
coefficients $\nu = 1 + \frac{1}{2} \sin 3x$ and $\sigma = \exp(2 \cos 3x)$.

Then, by the minimality property of the Galerkin solution in the energy norm and by (2.5) and
dUSD , one has 

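$$|||u - u_{n+1}||| \le |||u - (u_n + y_n)||| \le |||u - u_n - w_n + z_n|||$$

$$\le \frac{1}{\sqrt{\alpha_*}} \|L(u - u_n - w_n)\| + \sqrt{\alpha^*}\, \|z_n\| = \frac{1}{\sqrt{\alpha_*}} \|r_n - g_n\| + \sqrt{\alpha^*}\, \|z_n\| .$$

Thus,

$$|||u - u_{n+1}||| \le \frac{1}{\sqrt{\alpha_*}} \sqrt{1 - \theta^2}\, \|r_n\| + \sqrt{\alpha^*}\, \|z_n\| .$$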



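Now we can write $z_n = \big( P_{\Lambda^c_{n+1}} L^{-1} P_{\widetilde{\partial\Lambda}_n} \big) r_n$; hence, if $\Lambda_{n+1}$ is defined in such a way that

$$k \in \Lambda^c_{n+1} \ \text{and} \ \ell \in \widetilde{\partial\Lambda}_n \quad \Rightarrow \quad |k - \ell| > J ,$$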

then we have 

where we have used (2.23). Now, $J > 0$ can be chosen to satisfy



'i-e 2 



a*cr 



(3.15) 



in such a way that

$$|||u - u_{n+1}||| \le \frac{2}{\sqrt{\alpha_*}} \sqrt{1 - \theta^2}\, \|r_n\| \le 2 \Big( \frac{\alpha^*}{\alpha_*} \Big)^{1/2} \sqrt{1 - \theta^2}\, |||u - u_n||| . \qquad (3.16)$$

Note that, as desired, the new error reduction rate

$$\bar\rho = 2 \Big( \frac{\alpha^*}{\alpha_*} \Big)^{1/2} \sqrt{1 - \theta^2} \qquad (3.17)$$

can be made arbitrarily small by choosing $\theta$ arbitrarily close to 1. The procedure ENRICH is
thus defined as follows:

• $\Lambda^*$ := ENRICH($\Lambda$, $J$)

Given an integer $J \ge 0$ and a finite set $\Lambda \subset \mathbb{Z}^d$, the output is the set

$$\Lambda^* := \{ k \in \mathbb{Z}^d : \text{there exists } \ell \in \Lambda \text{ such that } |k - \ell| \le J \} .$$

Note that since the procedure adds a $d$-dimensional ball of radius $J$ around each point of $\Lambda$, the
cardinality of the new set $\Lambda^*$ can be estimated as

$$|\Lambda^*| \le |B_d(0, J) \cap \mathbb{Z}^d|\, |\Lambda| \sim \omega_d J^d |\Lambda| , \qquad (3.18)$$

where $\omega_d$ is the measure of the $d$-dimensional Euclidean unit ball $B_d(0, 1)$ centered at the origin.
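
The enrichment step can be sketched as below, assuming indices are stored as integer tuples of length $d$ and $|\cdot|$ is the Euclidean norm. The function name `enrich` and the tuple representation are hypothetical choices made for this illustration.

```python
import itertools

def enrich(indices, radius):
    """Union of the integer points in Euclidean balls of radius J around each index."""
    indices = [tuple(k) for k in indices]
    if not indices:
        return set()
    d = len(indices[0])
    # Offsets o in Z^d with |o| <= J, computed once and reused for every index.
    offsets = [o for o in itertools.product(range(-radius, radius + 1), repeat=d)
               if sum(c * c for c in o) <= radius * radius]
    return {tuple(ki + oi for ki, oi in zip(k, o))
            for k in indices for o in offsets}
```

The number of offsets equals $|B_d(0,J) \cap \mathbb{Z}^d|$, so the output size is at most that number times $|\Lambda|$, in line with (3.18).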

It is convenient for future reference to denote by $\partial\Lambda_n$ := E-DORFLER($r_n$, $\theta$, $J$) the procedure
described in (3.14). We summarize our results in the following theorem.

Theorem 3.3 (contraction property of A-ADFOUR) Consider the aggressive variant A-ADFOUR
of the adaptive algorithm ADFOUR, in which the step $\partial\Lambda_n$ := DORFLER($r_n$, $\theta$)
is replaced by

$$\partial\Lambda_n := \text{E-DORFLER}(r_n, \theta, J) ,$$

where $\theta$ is such that $\bar\rho$ defined in (3.17) is smaller than 1, and $J$ is the smallest integer for
which (3.15) is fulfilled. Let the assumptions of Property 2.2 or 2.3 be satisfied. Then, the same
conclusions of Theorem 3.1 hold true for this variant, with the contraction factor $\rho$ replaced by
$\bar\rho$. □



3.4 C-ADFOUR and PC-ADFOUR: ADFOUR with coarsening 

The adaptive algorithm ADFOUR and its variants introduced above are not guaranteed to
be optimal in terms of complexity. Indeed, the discussion in the forthcoming Sect. 5 for the
exponential case will indicate that the residual $r(u_\Lambda)$ may be significantly less sparse than the
corresponding Galerkin solution $u_\Lambda$; in particular, we will see that many indices in $\Lambda$, activated
in an early stage of the adaptive process, could be discarded later since the corresponding
components of $u_\Lambda$ are zero. For these reasons, we propose here a new variant of algorithm
ADFOUR, which incorporates a recursive coarsening step.

The algorithm is constructed through the procedures GAL, RES, DORFLER already
introduced in Sect. 3.1, together with the new procedure COARSE defined as follows:

• $\Lambda$ := COARSE($w$, $\epsilon$)

Given a function $w \in V_{\Lambda^*}$ for some finite index set $\Lambda^*$, and an accuracy $\epsilon$ which is known
to satisfy $\|u - w\| \le \epsilon$, the output $\Lambda \subseteq \Lambda^*$ is a set of minimal cardinality such that

$$\|w - P_\Lambda w\| \le 2\epsilon . \qquad (3.19)$$

We will subsequently show (see Theorem 6.1) that the cardinality $|\Lambda|$ is optimally related to
the sparsity class of $u$. The following result will be used several times in the paper.

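Property 3.1 (coarsening) The procedure COARSE guarantees the bounds

$$\|u - P_\Lambda w\| \le 3\epsilon \qquad (3.20)$$

and, for the Galerkin solution $u_\Lambda \in V_\Lambda$,

$$|||u - u_\Lambda||| \le 3\sqrt{\alpha^*}\, \epsilon . \qquad (3.21)$$

Proof. The first bound follows from the triangle inequality, since $\|u - P_\Lambda w\| \le \|u - w\| + \|w - P_\Lambda w\| \le \epsilon + 2\epsilon$;
the second one follows from the minimality property of the
Galerkin solution in the energy norm and from (2.5):

$$|||u - u_\Lambda||| \le |||u - P_\Lambda w||| \le \sqrt{\alpha^*}\, \|u - P_\Lambda w\| \le 3\sqrt{\alpha^*}\, \epsilon . \quad □$$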
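
The procedure COARSE defined through (3.19) admits a simple realization when $w$ is given by finitely many normalized coefficients: discard the smallest coefficients for as long as the discarded tail stays within $2\epsilon$. The sketch below assumes a dictionary from indices to coefficients; the name `coarse` and this representation are hypothetical.

```python
def coarse(w, eps):
    """Smallest index set S within supp(w) such that ||w - P_S w|| <= 2 * eps."""
    ordered = sorted(w, key=lambda k: abs(w[k]))   # smallest moduli first
    tail, dropped = 0.0, 0
    for k in ordered:
        # Stop as soon as dropping one more coefficient would exceed (2 eps)^2.
        if tail + abs(w[k]) ** 2 > (2.0 * eps) ** 2:
            break
        tail += abs(w[k]) ** 2
        dropped += 1
    return set(ordered[dropped:])
```

Dropping the smallest coefficients first maximizes the number of discarded indices, which is why the retained set has minimal cardinality.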

Given two parameters $\theta \in (0,1)$ and tol $\in [0,1)$, we define the following adaptive algorithm
with coarsening.

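Algorithm C-ADFOUR($\theta$, tol)

    Set $r_0 := f$, $\Lambda_0 := \emptyset$, $n = -1$
    do
        $n \leftarrow n + 1$
        set $\Lambda_{n,0} = \Lambda_n$, $r_{n,0} = r_n$
        $k = -1$
        do
            $k \leftarrow k + 1$
            $\partial\Lambda_{n,k}$ := DORFLER($r_{n,k}$, $\theta$)
            $\Lambda_{n,k+1} := \Lambda_{n,k} \cup \partial\Lambda_{n,k}$
            $u_{n,k+1}$ := GAL($\Lambda_{n,k+1}$)
            $r_{n,k+1}$ := RES($u_{n,k+1}$)
        while $\|r_{n,k+1}\| > \sqrt{1 - \theta^2}\, \|r_n\|$
        $\Lambda_{n+1}$ := COARSE($u_{n,k+1}$, $\frac{1}{\alpha_*} \|r_{n,k+1}\|$)
        $u_{n+1}$ := GAL($\Lambda_{n+1}$)
        $r_{n+1}$ := RES($u_{n+1}$)
    while $\|r_{n+1}\| >$ tol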

We observe that the specific choice of accuracy $\epsilon = \epsilon_n = \frac{1}{\alpha_*} \|r_{n,k+1}\|$ in each call of COARSE
in the algorithm above is motivated by the wish of guaranteeing a fixed reduction of the residual
and error at each outer iteration. This is made precise in the following theorem.

Theorem 3.4 (contraction property of C-ADFOUR) The algorithm C-ADFOUR satisfies

(i) The number of iterations of each inner loop is finite and bounded independently of $n$;

(ii) The sequence of residuals $r_n$ and errors $u - u_n$ generated for $n \ge 0$ by the algorithm satisfies
the inequalities

$$\|r_{n+1}\| \le \rho\, \|r_n\| \qquad (3.22)$$

and

$$|||u - u_{n+1}||| \le \rho\, |||u - u_n||| \qquad (3.23)$$

for

$$\rho = 3\, \frac{\alpha^*}{\alpha_*} \sqrt{1 - \theta^2} . \qquad (3.24)$$

In particular, if $\theta$ is chosen in such a way that $\rho < 1$, for any tol $> 0$ the algorithm
terminates in a finite number of iterations, whereas for tol $= 0$ the sequence $u_n$ converges
to $u$ in $H^1_p(\Omega)$ as $n \to \infty$.

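Proof. (i) For any fixed $n$, each inner iteration behaves as the algorithm ADFOUR considered
in Sect. 3.1. Hence, setting again $\rho = \sqrt{1 - \frac{\alpha_*}{\alpha^*} \theta^2}$, we have as in Theorem 3.1

$$|||u - u_{n,k+1}||| \le \rho^{k+1}\, |||u - u_n||| ,$$

which implies, by (2.9),

$$\|r_{n,k+1}\| \le \sqrt{\alpha^*}\, |||u - u_{n,k+1}||| \le \sqrt{\alpha^*}\, \rho^{k+1}\, |||u - u_n||| \le \sqrt{\frac{\alpha^*}{\alpha_*}}\, \rho^{k+1}\, \|r_n\| .$$

This shows that the termination criterion

$$\|r_{n,k+1}\| \le \sqrt{1 - \theta^2}\, \|r_n\| \qquad (3.25)$$

is certainly satisfied if

$$\sqrt{\frac{\alpha^*}{\alpha_*}}\, \rho^{k+1} \le \sqrt{1 - \theta^2} ,$$

i.e., as soon as

$$k + 1 \ge \frac{\log\big( \frac{\alpha_*}{\alpha^*} (1 - \theta^2) \big)}{2 \log \rho} > k .$$

We conclude that the number $K_n = k + 1$ of inner iterations is bounded by $1 + \frac{\log\left( \frac{\alpha_*}{\alpha^*} (1 - \theta^2) \right)}{2 \log \rho}$,
which is independent of $n$.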
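
As a quick numerical illustration of part (i), the bound on the number of inner iterations can be evaluated directly from $\theta$ and the ratio $\alpha_*/\alpha^*$. The function names and the arguments `alpha_lo`, `alpha_hi` (for $\alpha_*$, $\alpha^*$) are hypothetical.

```python
import math

def adfour_rho(theta, alpha_lo, alpha_hi):
    """Contraction factor of the inner ADFOUR iterations."""
    return math.sqrt(1.0 - alpha_lo / alpha_hi * theta ** 2)

def inner_iteration_bound(theta, alpha_lo, alpha_hi):
    """Upper bound 1 + log((alpha_*/alpha^*)(1 - theta^2)) / (2 log rho)."""
    rho = adfour_rho(theta, alpha_lo, alpha_hi)
    ratio = alpha_lo / alpha_hi * (1.0 - theta ** 2)
    return 1.0 + math.log(ratio) / (2.0 * math.log(rho))
```

Both logarithms are negative, so the quotient is positive; the bound depends only on $\theta$ and on $\alpha_*/\alpha^*$, not on the outer index $n$.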

(ii) By (piBjl . we have 

$$\|u - u_{n,k+1}\| \le \frac{1}{\alpha_*} \|r_{n,k+1}\| .$$

At the exit of the inner loop, the quantity on the right-hand side is precisely the parameter $\epsilon_n$
fed to the procedure COARSE; then, Property 3.1 yields

$$|||u - u_{n+1}||| \le 3\sqrt{\alpha^*}\, \epsilon_n .$$

On the other hand, the termination criterion (3.25) yields

$$\epsilon_n \le \frac{1}{\alpha_*} \sqrt{1 - \theta^2}\, \|r_n\| ,$$

so that

$$|||u - u_{n+1}||| \le 3\, \frac{\sqrt{\alpha^*}}{\alpha_*} \sqrt{1 - \theta^2}\, \|r_n\| .$$

This bound together with the left-hand inequality in (2.9) applied to $r_{n+1}$ yields (3.22), whereas
the same inequality applied to $r_n$ yields (3.23). □

A coarsening step can also be inserted in the aggressive algorithm A-ADFOUR considered
in Sect. 3.3; indeed, the enrichment step ENRICH could activate a larger number of degrees
of freedom than really needed, endangering optimality. The algorithm we now propose can be
viewed as a variant of C-ADFOUR, in which the use of E-DORFLER instead of DORFLER
allows one to take a single inner iteration; in this respect, one can consider the enrichment step
as a "prediction", and the coarsening step as a "correction", of the new set of active degrees of
freedom. For this reason, we call this variant the Predictor/Corrector-ADFOUR, or simply
PC-ADFOUR.

Given two parameters $\theta \in (0,1)$ and tol $\in [0,1)$, we choose $J \ge 1$ as the smallest integer for
which (3.15) is fulfilled, and we define the following adaptive algorithm.

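Algorithm PC-ADFOUR($\theta$, tol, $J$)

    Set $r_0 := f$, $\Lambda_0 := \emptyset$, $n = -1$
    do
        $n \leftarrow n + 1$
        $\partial\Lambda_n$ := E-DORFLER($r_n$, $\theta$, $J$)
        $\hat\Lambda_{n+1} := \Lambda_n \cup \partial\Lambda_n$
        $\hat u_{n+1}$ := GAL($\hat\Lambda_{n+1}$)
        $\Lambda_{n+1}$ := COARSE($\hat u_{n+1}$, $\frac{2}{\alpha_*} \sqrt{1 - \theta^2}\, \|r_n\|$)
        $u_{n+1}$ := GAL($\Lambda_{n+1}$)
        $r_{n+1}$ := RES($u_{n+1}$)
    while $\|r_{n+1}\| >$ tol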

Theorem 3.5 (contraction property of PC-ADFOUR) If the assumptions of Property 2.2
or Property 2.3 are satisfied, then the statement (ii) of Theorem 3.4 applies to Algorithm PC-ADFOUR
as well.

Proof. Let $\hat u_{n+1}$ denote the Galerkin solution on the enriched set $\Lambda_n \cup \partial\Lambda_n$, computed before
coarsening. The first inequalities in both (3.16) and (2.5) yield

$$\|u - \hat u_{n+1}\| \le \frac{2}{\alpha_*} \sqrt{1 - \theta^2}\, \|r_n\| .$$

Since the right-hand side is precisely the parameter $\epsilon_n$ fed to the procedure COARSE, one
proceeds as in the proof of Theorem 3.4. □

4 Nonlinear approximation in Fourier spaces
4.1 Best N-term approximation and rearrangement

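Given any nonempty finite index set $\Lambda \subset \mathbb{Z}^d$ and the corresponding subspace $V_\Lambda \subset V = H^1_p(\Omega)$
of dimension $|\Lambda| = \text{card}\, \Lambda$, the best approximation of $v$ in $V_\Lambda$ is the orthogonal projection of $v$
upon $V_\Lambda$, i.e. the function $P_\Lambda v = \sum_{k \in \Lambda} \hat v_k \phi_k$, which satisfies

$$\|v - P_\Lambda v\| = \Big( \sum_{k \notin \Lambda} |\hat V_k|^2 \Big)^{1/2}$$

(we set $P_\Lambda v = 0$ if $\Lambda = \emptyset$). For any integer $N \ge 1$, we minimize this error over all possible
choices of $\Lambda$ with cardinality $N$, thereby leading to the best N-term approximation error

$$E_N(v) = \inf_{\Lambda \subset \mathbb{Z}^d,\ |\Lambda| = N} \|v - P_\Lambda v\| .$$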
AcZ d , |A|=JV 

A way to construct a best N-term approximation $v_N$ of $v$ consists of rearranging the coefficients
of $v$ in decreasing order of modulus

$$|\hat V_{k_1}| \ge \dots \ge |\hat V_{k_n}| \ge |\hat V_{k_{n+1}}| \ge \dots$$

and setting $v_N = P_{\Lambda_N} v$ with $\Lambda_N = \{k_n : 1 \le n \le N\}$. As already mentioned in the
Introduction, let us denote from now on $v^*_n = \hat V_{k_n}$ the rearranged and rescaled Fourier coefficients
of $v$. Then,

$$E_N(v) = \Big( \sum_{n > N} |v^*_n|^2 \Big)^{1/2} .$$
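
For a finitely supported coefficient sequence, the best N-term error just described reduces to sorting moduli and summing the tail. The following sketch assumes the normalized coefficients are given as a plain list; the function name `best_n_term_error` is hypothetical.

```python
import math

def best_n_term_error(coeffs, n):
    """E_N(v): l2 norm of all but the n largest coefficients in modulus."""
    moduli = sorted((abs(c) for c in coeffs), reverse=True)
    return math.sqrt(sum(m * m for m in moduli[n:]))
```

For instance, with coefficients `[3.0, -4.0, 0.0, 1.0]` the two largest moduli are 4 and 3, so only the coefficient 1 remains in the tail for `n = 2`.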



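Next, given a strictly decreasing function $\phi : \mathbb{N} \to \mathbb{R}_+$ such that $\phi(0) = \phi_0$ for some $\phi_0 > 0$
and $\phi(N) \to 0$ when $N \to \infty$, we introduce the corresponding sparsity class $\mathcal{A}_\phi$ by setting

$$\mathcal{A}_\phi = \Big\{ v \in V : \|v\|_{\mathcal{A}_\phi} := \sup_{N \ge 0} \frac{E_N(v)}{\phi(N)} < +\infty \Big\} . \qquad (4.1)$$

We point out that in applications $\|v\|_{\mathcal{A}_\phi}$ need not be a (quasi-)norm since $\mathcal{A}_\phi$ need not be a linear
space. Note however that $\|v\|_{\mathcal{A}_\phi}$ always controls the $V$-norm of $v$, since $\|v\| = E_0(v) \le \phi_0 \|v\|_{\mathcal{A}_\phi}$.
Observe that $v \in \mathcal{A}_\phi$ iff there exists a constant $c > 0$ such that

$$E_N(v) \le c\, \phi(N) , \quad \forall N \ge 0 . \qquad (4.2)$$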

The quantity $\|v\|_{\mathcal{A}_\phi}$ dictates the minimal number $N_\epsilon$ of basis functions needed to approximate
$v$ with accuracy $\epsilon$. In fact, from the relations

$$E_{N_\epsilon}(v) \le \epsilon < E_{N_\epsilon - 1}(v) \le \phi(N_\epsilon - 1)\, \|v\|_{\mathcal{A}_\phi} ,$$

and the monotonicity of $\phi$, we obtain

$$N_\epsilon \le \phi^{-1}\Big( \frac{\epsilon}{\|v\|_{\mathcal{A}_\phi}} \Big) + 1 . \qquad (4.3)$$

The second addend on the right-hand side can be absorbed by a multiple of the first one, provided
$\epsilon$ is sufficiently small; in other words, it is not restrictive to assume that there exists a constant
$\kappa$ slightly larger than 1 such that

$$N_\epsilon \le \kappa\, \phi^{-1}\Big( \frac{\epsilon}{\|v\|_{\mathcal{A}_\phi}} \Big) . \qquad (4.4)$$
Remark 4.1 (sparsity class for $V'$) Replacing $V$ by $V'$ in (4.1) leads to the definition of a
sparsity class, still denoted by $\mathcal{A}_\phi$, in the space of linear continuous forms $f$ on $H^1_p(\Omega)$. This
observation applies to the subsequent definitions as well (e.g., for the class $\mathcal{A}^{\eta,t}_G$). In essence, we
will treat in a unified way the nonlinear approximation of a function $v \in H^1_p(\Omega)$ and of a form
$f \in H^{-1}_p(\Omega)$.


Throughout the paper, we shall consider two main families of sparsity classes, identified by
specific choices of the function $\phi$ depending upon one or more parameters. The first family
is related to the best approximation in Besov spaces of periodic functions, thus accounting
for a finite-order regularity in $\Omega$; the corresponding functions $\phi$ exhibit an algebraic decay as
$N \to \infty$, which motivates our terminology of algebraic classes. The second family is related to
the best approximation in Gevrey spaces of periodic functions, which are formed by infinitely
differentiable functions in $\Omega$; the associated $\phi$'s exhibit an exponential decay, and for this reason
such classes will be referred to as exponential classes. Properties of both families are collected
hereafter.

4.2 Algebraic classes 

The following is the counterpart for Fourier approximations of by now well-known nonlinear 
approximation settings [11], e.g. for wavelets or nested finite elements. For this reason, we just 
state definitions and properties without proofs. 
For $s > 0$, let us introduce the function

$$\phi(N) = N^{-s/d} \quad \text{for } N \ge 1 , \qquad (4.5)$$

and $\phi(0) = \phi_0 > 1$ arbitrary, with inverse

$$\phi^{-1}(\lambda) = \lambda^{-d/s} \quad \text{for } \lambda \le 1 , \qquad (4.6)$$

and let us consider the corresponding class $\mathcal{A}_\phi$ defined in (4.1).

Definition 4.1 (algebraic class of functions) We denote by $\mathcal{A}^s_B$ the subset of $V$ defined as

$$\mathcal{A}^s_B := \Big\{ v \in V : \|v\|_{\mathcal{A}^s_B} := \|v\| + \sup_{N \ge 1} E_N(v)\, N^{s/d} < +\infty \Big\} .$$

It is immediately seen that $\mathcal{A}^s_B$ contains the Sobolev space of periodic functions $H^{s+1}_p(\Omega)$. On
the other hand, it is proven in [12], as a part of a more general result, that for $0 < \sigma, \tau \le \infty$, the
Besov space $B^{s+1}_{\tau,\sigma}(\Omega) = B^{s+1}_\sigma(L^\tau(\Omega))$ is contained in $\mathcal{A}^{s^*}_B$ provided $s^* := s - d(1/\tau - 1/2)_+ > 0$.
Let us associate the quantity $\tau > 0$ to the parameter $s$, via the relation

$$\frac{1}{\tau} = \frac{s}{d} + \frac{1}{2} .$$

The condition for a function $v$ to belong to some class $\mathcal{A}^s_B$ can be equivalently stated as a
condition on the vector $\mathbf{v} = (\hat V_k)_{k \in \mathbb{Z}^d}$ of its Fourier coefficients, precisely, on the rate of decay
of the non-increasing rearrangement $\mathbf{v}^* = (v^*_n)_{n \ge 1}$ of $\mathbf{v}$.

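Definition 4.2 (algebraic class of sequences) Let $\ell^s_B(\mathbb{Z}^d)$ be the subset of sequences $\mathbf{v} \in
\ell^2(\mathbb{Z}^d)$ so that

$$\|\mathbf{v}\|_{\ell^s_B(\mathbb{Z}^d)} := \sup_{n \ge 1} n^{1/\tau} |v^*_n| < +\infty .$$

Note that this space is often denoted by $\ell^\tau_w(\mathbb{Z}^d)$ in the literature, being an example of Lorentz
space.

The relationship between $\mathcal{A}^s_B$ and $\ell^s_B(\mathbb{Z}^d)$ is stated in the following Proposition.

Proposition 4.1 (equivalence of algebraic classes) Given a function $v \in V$ and the sequence
$\mathbf{v}$ of its Fourier coefficients, one has $v \in \mathcal{A}^s_B$ if and only if $\mathbf{v} \in \ell^s_B(\mathbb{Z}^d)$, with

$$\|v\|_{\mathcal{A}^s_B} \lesssim \|\mathbf{v}\|_{\ell^s_B(\mathbb{Z}^d)} \lesssim \|v\|_{\mathcal{A}^s_B} .$$

At last, we note that the quasi-Minkowski inequality

$$\|\mathbf{u} + \mathbf{v}\|_{\ell^s_B(\mathbb{Z}^d)} \le C_s \big( \|\mathbf{u}\|_{\ell^s_B(\mathbb{Z}^d)} + \|\mathbf{v}\|_{\ell^s_B(\mathbb{Z}^d)} \big)$$

holds in $\ell^s_B(\mathbb{Z}^d)$, yet the constant $C_s$ blows up exponentially as $s \to \infty$.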

4.3 Exponential classes 

We first recall the definition of Gevrey spaces of periodic functions in $\Omega = (0, 2\pi)^d$ (see [14]).
Given reals $\eta > 0$, $0 < t \le d$ and $s \ge 0$, we set

$$G^{\eta,t,s}_p(\Omega) := \Big\{ v \in L^2(\Omega) : \|v\|^2_{G,\eta,t,s} = \sum_{k \in \mathbb{Z}^d} e^{2\eta |k|^t} (1 + |k|^{2s}) |\hat v_k|^2 < +\infty \Big\} .$$

Note that $G^{\eta,t,s}_p(\Omega)$ is contained in all Sobolev spaces of periodic functions $H^r_p(\Omega)$, $r \ge 0$.
Furthermore, if $t \ge 1$, $G^{\eta,t,s}_p(\Omega)$ is made of analytic functions.

Gevrey spaces have been introduced to study the $C^\infty$ and analytical regularity of the solutions
of partial differential equations. For our elliptic problem (2.3), the following statement is
an example of shift theorem in Gevrey spaces.

Theorem 4.1 (shift theorem) If the assumptions of Property 2.3 are satisfied, then for any
$\eta < \bar\eta_L$, $0 < t \le 1$ and $s \ge -1$, $L$ is an isomorphism between $G^{\eta,t,s+2}_p(\Omega)$ and $G^{\eta,t,s}_p(\Omega)$.

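Proof. Proceeding as in Sect. 2.3, it is immediate to see that the problem $Lu = f$ can be
equivalently formulated as $\mathbf{A}\mathbf{u} = \mathbf{f}$, where the vectors $\mathbf{f}$ and $\mathbf{u}$ contain the Fourier coefficients
of functions $f$ and $u$ normalized in $H^s_p(\Omega)$ and $H^{s+2}_p(\Omega)$, respectively. If $\mathbf{W} = \text{diag}(e^{\eta |k|^t})$ is a
bi-infinite diagonal exponential matrix, then we can write $\mathbf{W}\mathbf{u} = \mathbf{W}\mathbf{A}^{-1}\mathbf{f} = (\mathbf{W}\mathbf{A}^{-1}\mathbf{W}^{-1})\mathbf{W}\mathbf{f}$.
We observe that the property $\|\mathbf{W}\mathbf{u}\|_{\ell^2} \lesssim \|\mathbf{W}\mathbf{f}\|_{\ell^2}$, which implies the thesis, is a consequence of
$\|\mathbf{W}\mathbf{A}^{-1}\mathbf{W}^{-1}\|_{\ell^2} \lesssim 1$.

To show the latter inequality, we let $\mathbf{x}, \mathbf{y} \in \ell^2(\mathbb{Z}^d)$ and notice that

$$|\mathbf{y}^T \mathbf{W}\mathbf{A}^{-1}\mathbf{W}^{-1} \mathbf{x}| \le c_L \sum_{m \in \mathbb{Z}^d} \sum_{k \in \mathbb{Z}^d} |y_{m+k}|\, e^{\eta |m+k|^t} e^{-\bar\eta_L |m|} e^{-\eta |k|^t} |x_k| .$$

Since $0 < t \le 1$, we deduce $|m + k|^t \le |m|^t + |k|^t$ and $e^{\eta (|m+k|^t - |k|^t)} \le e^{\eta |m|^t}$, whence

$$\|\mathbf{W}\mathbf{A}^{-1}\mathbf{W}^{-1} \mathbf{x}\| = \sup_{\|\mathbf{y}\| = 1} |\mathbf{y}^T \mathbf{W}\mathbf{A}^{-1}\mathbf{W}^{-1} \mathbf{x}| \le c_L \sum_{m \in \mathbb{Z}^d} e^{-(\bar\eta_L - \eta)|m|}\, \|\mathbf{x}\| \lesssim \|\mathbf{x}\|$$

because $\bar\eta_L > \eta$ and the series converges. This implies the desired estimate. □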

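From now on, we fix $s = 1$ and we normalize again the Fourier coefficients of a function $v$
with respect to the $H^1_p(\Omega)$-norm. Thus, we set

$$G^{\eta,t}_p(\Omega) = G^{\eta,t,1}_p(\Omega) = \Big\{ v \in V : \|v\|^2_{G,\eta,t} = \sum_k e^{2\eta |k|^t} |\hat V_k|^2 < +\infty \Big\} . \qquad (4.7)$$

Functions in $G^{\eta,t}_p(\Omega)$ can be approximated by the linear orthogonal projection

$$P_M v = \sum_{|k| \le M} \hat v_k \phi_k ,$$

for which we have

$$\|v - P_M v\|^2 = \sum_{|k| > M} |\hat V_k|^2 = \sum_{|k| > M} e^{-2\eta |k|^t} e^{2\eta |k|^t} |\hat V_k|^2 \le e^{-2\eta M^t} \sum_{|k| > M} e^{2\eta |k|^t} |\hat V_k|^2 \le e^{-2\eta M^t} \|v\|^2_{G,\eta,t} .$$

As already observed in Property 2.4, setting $N = \text{card}\{k : |k| \le M\}$, one has $N \sim \omega_d M^d$, so
that

$$E_N(v) \le \|v - P_M v\| \lesssim \exp\big( -\eta\, \omega_d^{-t/d} N^{t/d} \big)\, \|v\|_{G,\eta,t} . \qquad (4.8)$$

Hence, we are led to introduce the function

$$\phi(N) = \exp\big( -\eta\, \omega_d^{-t/d} N^{t/d} \big) \quad (N \ge 0) , \qquad (4.9)$$

whose inverse is given by

$$\phi^{-1}(\lambda) = \frac{\omega_d}{\eta^{d/t}} \Big( \log \frac{1}{\lambda} \Big)^{d/t} \quad (\lambda \le 1) , \qquad (4.10)$$

and to consider the corresponding class $\mathcal{A}_\phi$ defined in (4.1), which therefore contains $G^{\eta,t}_p(\Omega)$.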
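
The pair (4.9)-(4.10) can be checked numerically with the short sketch below. It assumes that $\omega_d$ is the volume of the Euclidean unit ball, computed from the standard formula $\pi^{d/2} / \Gamma(d/2 + 1)$; the function names are hypothetical.

```python
import math

def unit_ball_volume(d):
    """Volume omega_d of the d-dimensional Euclidean unit ball."""
    return math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)

def phi_exp(n, eta, t, d):
    """phi(N) = exp(-eta * omega_d^(-t/d) * N^(t/d)), as in (4.9)."""
    w = unit_ball_volume(d)
    return math.exp(-eta * w ** (-t / d) * n ** (t / d))

def phi_exp_inv(lam, eta, t, d):
    """phi^{-1}(lambda) = omega_d / eta^(d/t) * log(1/lambda)^(d/t), as in (4.10)."""
    w = unit_ball_volume(d)
    return w / eta ** (d / t) * math.log(1.0 / lam) ** (d / t)
```

Composing the two functions returns the original $N$ up to rounding, which is a convenient sanity check on the exponents $t/d$ and $d/t$.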

Definition 4.3 (exponential class of functions) We denote by $\mathcal{A}^{\eta,t}_G$ the subset of $V$
defined as

$$\mathcal{A}^{\eta,t}_G := \Big\{ v \in V : \|v\|_{\mathcal{A}^{\eta,t}_G} := \sup_{N \ge 0} E_N(v) \exp\big( \eta\, \omega_d^{-t/d} N^{t/d} \big) < +\infty \Big\} .$$

At this point, we make the subsequent notation easier by introducing the $t$-dependent function

$$\tau = \frac{t}{d} \le 1 .$$

As in the algebraic case, the class $\mathcal{A}^{\eta,t}_G$ can be equivalently characterized in terms of behavior of
rearranged sequences of Fourier coefficients.

Definition 4.4 (exponential class of sequences) Let $\ell^{\eta,t}_G(\mathbb{Z}^d)$ be the subset of sequences $\mathbf{v} \in
\ell^2(\mathbb{Z}^d)$ so that

$$\|\mathbf{v}\|_{\ell^{\eta,t}_G(\mathbb{Z}^d)} := \sup_{n \ge 1} n^{(1-\tau)/2} \exp\big( \eta\, \omega_d^{-\tau} n^\tau \big) |v^*_n| < +\infty ,$$

where $\mathbf{v}^* = (v^*_n)_{n=1}^\infty$ is the non-increasing rearrangement of $\mathbf{v}$.

The relationship between $\mathcal{A}^{\eta,t}_G$ and $\ell^{\eta,t}_G(\mathbb{Z}^d)$ is stated in the following Proposition.

Proposition 4.2 (equivalence of exponential classes) Given a function $v \in V$ and the sequence
$\mathbf{v} = (\hat V_k)_{k \in \mathbb{Z}^d}$ of its Fourier coefficients, one has $v \in \mathcal{A}^{\eta,t}_G$ if and only if $\mathbf{v} \in \ell^{\eta,t}_G(\mathbb{Z}^d)$,
with

$$\|v\|_{\mathcal{A}^{\eta,t}_G} \lesssim \|\mathbf{v}\|_{\ell^{\eta,t}_G(\mathbb{Z}^d)} \lesssim \|v\|_{\mathcal{A}^{\eta,t}_G} .$$

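Proof. Assume first that $\mathbf{v} \in \ell^{\eta,t}_G(\mathbb{Z}^d)$. Then,

$$E_N(v)^2 = \|v - P_N(v)\|^2 = \sum_{n > N} |v^*_n|^2 \le \sum_{n > N} n^{\tau - 1} \exp\big( -2\eta\, \omega_d^{-\tau} n^\tau \big)\, \|\mathbf{v}\|^2_{\ell^{\eta,t}_G(\mathbb{Z}^d)} .$$

Now, setting for simplicity $\alpha = 2\eta\, \omega_d^{-\tau}$, one has

$$S := \sum_{n > N} n^{\tau - 1} e^{-\alpha n^\tau} \sim \int_N^\infty x^{\tau - 1} e^{-\alpha x^\tau}\, dx .$$

The substitution $z = x^\tau$ yields

$$S \sim \frac{1}{\tau} \int_{N^\tau}^\infty e^{-\alpha z}\, dz = \frac{1}{\alpha \tau}\, e^{-\alpha N^\tau} ,$$

whence $\|v\|_{\mathcal{A}^{\eta,t}_G} \lesssim \|\mathbf{v}\|_{\ell^{\eta,t}_G(\mathbb{Z}^d)}$. Conversely, let $v \in \mathcal{A}^{\eta,t}_G$. We have to prove that for any $n \ge 1$,
one has

$$n^{1 - \tau} |v^*_n|^2 \lesssim e^{-\alpha n^\tau}\, \|v\|^2_{\mathcal{A}^{\eta,t}_G} .$$

Let $m < n$ be the largest integer such that $n - m \ge n^{1 - \tau}$ (note that $0 \le 1 - \tau < 1$), i.e.,
$m \sim n(1 - n^{-\tau})$. Then,

$$n^{1 - \tau} |v^*_n|^2 \le (n - m) |v^*_n|^2 \le \sum_{j = m+1}^{n} |v^*_j|^2 \le \|v - P_m(v)\|^2 \le e^{-\alpha m^\tau}\, \|v\|^2_{\mathcal{A}^{\eta,t}_G} .$$

Now, by Taylor expansion,

$$m^\tau \sim n^\tau (1 - n^{-\tau})^\tau = n^\tau \big( 1 - \tau n^{-\tau} + o(n^{-\tau}) \big) = n^\tau - \tau + o(1) ,$$

so that $e^{-\alpha m^\tau} \lesssim e^{-\alpha n^\tau}$, and $\|\mathbf{v}\|_{\ell^{\eta,t}_G(\mathbb{Z}^d)} \lesssim \|v\|_{\mathcal{A}^{\eta,t}_G}$ is proven. □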



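Next, we briefly comment on the structure of the set $\ell^{\eta,t}_G(\mathbb{Z}^d)$. This is not a vector space,
since it may happen that $\mathbf{u}$, $\mathbf{v}$ belong to this set, whereas $\mathbf{u} + \mathbf{v}$ does not. Assume for simplicity
that $\tau = 1$ and consider for instance the sequences in $\ell^{\eta,t}_G(\mathbb{Z}^d)$

$$\mathbf{u} = (e^{-\eta}, 0, e^{-2\eta}, 0, e^{-3\eta}, 0, e^{-4\eta}, 0, \dots) ,$$

$$\mathbf{v} = (0, e^{-\eta}, 0, e^{-2\eta}, 0, e^{-3\eta}, 0, e^{-4\eta}, \dots) .$$

Then,

$$\mathbf{u} + \mathbf{v} = (\mathbf{u} + \mathbf{v})^* = (e^{-\eta}, e^{-\eta}, e^{-2\eta}, e^{-2\eta}, e^{-3\eta}, e^{-3\eta}, e^{-4\eta}, e^{-4\eta}, \dots) ;$$

thus, $(\mathbf{u} + \mathbf{v})^*_{2j} = e^{-\eta j}$, so that $e^{2\eta j} (\mathbf{u} + \mathbf{v})^*_{2j} \to \infty$ as $j \to +\infty$, i.e., $\mathbf{u} + \mathbf{v} \notin \ell^{\eta,t}_G(\mathbb{Z}^d)$. On the
other hand, we have the following property.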

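Lemma 4.1 (quasi-triangle inequality) If $\mathbf{u}_i \in \ell^{\eta_i,t}_G(\mathbb{Z}^d)$ for $i = 1, 2$, then $\mathbf{u}_1 + \mathbf{u}_2 \in \ell^{\eta,t}_G(\mathbb{Z}^d)$
with

$$\|\mathbf{u}_1 + \mathbf{u}_2\|_{\ell^{\eta,t}_G} \lesssim \|\mathbf{u}_1\|_{\ell^{\eta_1,t}_G} + \|\mathbf{u}_2\|_{\ell^{\eta_2,t}_G} , \qquad \eta^{-1/\tau} = \eta_1^{-1/\tau} + \eta_2^{-1/\tau} .$$

Proof. We use the characterization given by Proposition 4.2, so that

$$\|u_i - P_{N_i}(u_i)\| \le \|u_i\|_{\mathcal{A}^{\eta_i,t}_G} \exp\big( -\eta_i\, \omega_d^{-\tau} N_i^\tau \big) , \quad i = 1, 2 .$$

Given $N \ge 1$, we seek $N_1$, $N_2$ so that

$$N = N_1 + N_2 , \qquad \eta_1 N_1^\tau = \eta_2 N_2^\tau .$$

This implies

$$N = N_1 \eta_1^{1/\tau} \big( \eta_1^{-1/\tau} + \eta_2^{-1/\tau} \big) = N_1 \eta_1^{1/\tau} \eta^{-1/\tau}$$

and

$$\|(u_1 + u_2) - P_N(u_1 + u_2)\| \le \|u_1 - P_{N_1}(u_1)\| + \|u_2 - P_{N_2}(u_2)\|$$

$$\le \|u_1\|_{\mathcal{A}^{\eta_1,t}_G} \exp\big( -\eta_1 \omega_d^{-\tau} N_1^\tau \big) + \|u_2\|_{\mathcal{A}^{\eta_2,t}_G} \exp\big( -\eta_2 \omega_d^{-\tau} N_2^\tau \big)$$

$$= \big( \|u_1\|_{\mathcal{A}^{\eta_1,t}_G} + \|u_2\|_{\mathcal{A}^{\eta_2,t}_G} \big) \exp\big( -\eta\, \omega_d^{-\tau} N^\tau \big) ,$$

whence the assertion. □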
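
The balancing argument in the proof can be verified numerically. The sketch below computes the combined rate $\eta$ from $\eta^{-1/\tau} = \eta_1^{-1/\tau} + \eta_2^{-1/\tau}$ and the (real-valued) split $N = N_1 + N_2$ with $\eta_1 N_1^\tau = \eta_2 N_2^\tau$; the function names are hypothetical and integer rounding of $N_1$, $N_2$ is ignored.

```python
def combined_eta(eta1, eta2, tau):
    """Solve eta^(-1/tau) = eta1^(-1/tau) + eta2^(-1/tau) for eta."""
    return (eta1 ** (-1.0 / tau) + eta2 ** (-1.0 / tau)) ** (-tau)

def balanced_split(n, eta1, eta2, tau):
    """Real N1, N2 with N1 + N2 = n and eta1 * N1^tau = eta2 * N2^tau."""
    eta = combined_eta(eta1, eta2, tau)
    n1 = n * (eta / eta1) ** (1.0 / tau)
    return n1, n - n1
```

With this split, both exponents $\eta_i N_i^\tau$ coincide with $\eta N^\tau$, which is what allows the two exponential factors to be merged in the last step of the proof.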

Note that when $\eta_1 = \eta_2$ we obtain $\eta = 2^{-\tau} \eta_1 \ge 2^{-1} \eta_1$ (recall that $\tau \le 1$), thereby extending the previous
counterexample.

5 Sparsity classes of the residual 

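For any finite index set $\Lambda$, let $r = r(u_\Lambda)$ be the residual produced by the Galerkin solution $u_\Lambda$.
Under Assumption 3.1, the step

$$\partial\Lambda := \text{DORFLER}(r, \theta)$$

selects a set $\partial\Lambda$ of minimal cardinality in $\Lambda^c$ for which $\|r - P_{\partial\Lambda} r\| \le \sqrt{1 - \theta^2}\, \|r\|$. Thus, if $r$
belongs to a certain sparsity class $\mathcal{A}_\phi$, identified by a function $\phi$, then (4.3) yields

$$|\partial\Lambda| \le \phi^{-1}\Big( \sqrt{1 - \theta^2}\, \frac{\|r\|}{\|r\|_{\mathcal{A}_\phi}} \Big) + 1 . \qquad (5.1)$$

Explicitly, if $r \in \mathcal{A}^s_B$ for some $s > 0$, we have by (4.6)

$$|\partial\Lambda| \le (1 - \theta^2)^{-d/2s} \Big( \frac{\|r\|_{\mathcal{A}^s_B}}{\|r\|} \Big)^{d/s} + 1 ,$$

whereas if $r \in \mathcal{A}^{\eta,t}_G$ for some $\eta > 0$ and $t > 0$, we have by (4.10)

$$|\partial\Lambda| \le \frac{\omega_d}{\eta^{d/t}} \Big( \log \frac{\|r\|_{\mathcal{A}^{\eta,t}_G}}{\|r\|} + \big| \log \sqrt{1 - \theta^2} \big| \Big)^{d/t} + 1 .$$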
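
The two explicit bounds on $|\partial\Lambda|$ can be evaluated as in the following sketch, which takes the ratio between the sparsity measure of $r$ and $\|r\|$ as an input. The function names and the argument `ratio` are hypothetical; $\omega_d$ is assumed to be the volume of the Euclidean unit ball.

```python
import math

def unit_ball_volume(d):
    """Volume omega_d of the d-dimensional Euclidean unit ball."""
    return math.pi ** (d / 2.0) / math.gamma(d / 2.0 + 1.0)

def marked_bound_algebraic(theta, ratio, s, d):
    """(1 - theta^2)^(-d/(2s)) * ratio^(d/s) + 1, with ratio = ||r||_A / ||r||."""
    return (1.0 - theta ** 2) ** (-d / (2.0 * s)) * ratio ** (d / s) + 1.0

def marked_bound_exponential(theta, ratio, eta, t, d):
    """omega_d / eta^(d/t) * (log ratio + |log sqrt(1 - theta^2)|)^(d/t) + 1."""
    w = unit_ball_volume(d)
    inner = math.log(ratio) + abs(math.log(math.sqrt(1.0 - theta ** 2)))
    return w / eta ** (d / t) * inner ** (d / t) + 1.0
```

In the exponential case the ratio enters only through its logarithm, whereas in the algebraic case it enters as a power; both bounds grow as $\theta$ approaches 1.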

We stress the fact that the cardinality of $\partial\Lambda$ is related to the sparsity class of the residual.
We will see in the rest of this section that such a class does coincide with the sparsity class
of the solution in the algebraic case, whereas it is different (indeed, worse) in the exponential
case. This is a crucial point to be kept in mind in the forthcoming optimality analysis of our
algorithms.

The cardinality of $\partial\Lambda$ depends indeed on how much the sparsity measure $\|r\|_{\mathcal{A}_\phi}$ deviates
from the Hilbert norm $\|r\|$. So, before embarking on the study of the relationship
between the sparsity classes of the residual and of the solution, we make some brief comments
on the ratio between these two quantities. For brevity, we only consider the exponential case,
although similar considerations apply to the algebraic case as well. The size of the ratio

$$Q = \frac{\|r\|_{\mathcal{A}^{\eta,t}_G}}{\|r\|}$$

depends on the relative behavior of the rearranged coefficients $r^*_n$ of $r$, which by Definition 4.4
and Proposition 4.2 satisfy

$$|r^*_n| \le \lambda^* n^{(\tau - 1)/2} e^{-\eta\, \omega_d^{-\tau} n^\tau}\, \|r\|_{\mathcal{A}^{\eta,t}_G} \qquad (5.2)$$

for some constant $\lambda^* > 0$, with $\tau = t/d$. Let us consider two representative situations.

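Example 5.1 (genuinely decaying functions) The most "favorable" situation is the one in
which the sequence of rearranged coefficients decays precisely at the rate given by the right-hand
side of (5.2): in other words, suppose that there exists a constant $\lambda_* > 0$ such that for all $n \ge 1$

$$\lambda_* n^{(\tau - 1)/2} e^{-\eta\, \omega_d^{-\tau} n^\tau}\, \|r\|_{\mathcal{A}^{\eta,t}_G} \le |r^*_n| \le \lambda^* n^{(\tau - 1)/2} e^{-\eta\, \omega_d^{-\tau} n^\tau}\, \|r\|_{\mathcal{A}^{\eta,t}_G} . \qquad (5.3)$$

Then,

$$(\lambda_*)^2 \sum_{n \ge 1} n^{\tau - 1} e^{-2\eta\, \omega_d^{-\tau} n^\tau}\, \|r\|^2_{\mathcal{A}^{\eta,t}_G} \le \|r\|^2 \le (\lambda^*)^2 \sum_{n \ge 1} n^{\tau - 1} e^{-2\eta\, \omega_d^{-\tau} n^\tau}\, \|r\|^2_{\mathcal{A}^{\eta,t}_G} ,$$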

and since 

»+oo 



n>l 

we obtain 



/+oo r+oo 



1 <Q< 1 



CA* C\* 

Thus, if (5.3) is a "tight" bound, the ratio $Q$ is "small", and the procedure DORFLER activates
a moderate number of degrees of freedom at the current iteration. □

Example 5.2 (plateaux) The opposite situation, i.e., the worst scenario, occurs when the
sequence of rearranged coefficients of $r$ exhibits large "plateaux" consisting of equal (or nearly
equal) elements in modulus. Fix an integer $K$ arbitrarily large, and suppose that the $K$ largest
coefficients of $r$ satisfy

I * I I * I I * I 1*1 \*T J r(T—l)/2 — fjid~7 T K T ||_|| 

l r ll = \ r 2\ = • • • = \ r K-l\ = \ r K\ = X K y >' e ' d VWjjhi ■ 

Since 

n>K J( K + i r 
there exists S G (0, 1) such that 

^-{X*)\K + 5fe- 2 ^ Kf \\r\\ 2 An>l . 



r 



We conclude that the ratio 

Q = — r- 

^ A*(K + 5)^/ 2 

turns out to be arbitrarily large, and indeed for such a residual it is easily seen that Dorfler's 
condition ||-PdA r ll — $ll r ll requires |<9A| to be of the order of OK. □ 



Let us now investigate the sparsity classes of the residual, treating the algebraic and exponential
cases separately. Note that, in view of Propositions 4.1 or 4.2, for studying the sparsity
classes of certain functions $v$ and $Lv$ we are entitled to study, equivalently, the sparsity classes
of the related vectors $\mathbf{v}$ and $\mathbf{A}\mathbf{v}$, where $\mathbf{A}$ is the stiffness matrix (2.10).

5.1 Algebraic case 

We first recall the notion of matrix compressibility (see [7] where the concept has been used in
the wavelet context).

Definition 5.1 (matrix compressibility) For $s^* > 0$, a bounded matrix $\mathbf{A} : \ell^2(\mathbb{Z}^d) \to \ell^2(\mathbb{Z}^d)$
is called $s^*$-compressible if for any $j \in \mathbb{N}$ there exist constants $\alpha_j$ and $C_j$ and a matrix $\mathbf{A}_j$ having
at most $\alpha_j 2^j$ non-zero entries per column, such that

$$\|\mathbf{A} - \mathbf{A}_j\| \le C_j ,$$

where $\{\alpha_j\}_{j \in \mathbb{N}}$ is summable, and for any $s < s^*$, $\{C_j 2^{sj/d}\}_{j \in \mathbb{N}}$ is summable.

Concerning the compressibility of the matrices belonging to the class $\mathcal{D}_a(\eta_L)$ of Definition 2.1,
the following result can be found in [9, Lemma 3.6]. We report here the proof for completeness.

Lemma 5.1 (compressibility) If $s^* := \eta_L - d > 0$, then any matrix $\mathbf{A} \in \mathcal{D}_a(\eta_L)$ is $s^*$-compressible.

Proof. Let us take $N_j = \lceil 2^{j/d} / (j+1)^2 \rceil$, where $\lceil \cdot \rceil$ denotes the integer part plus 1. Then by Property
2.4 (algebraic case) there holds $\|\mathbf{A} - \mathbf{A}_{N_j}\| \lesssim 2^{-j(\eta_L - d)/d} (j+1)^{2(\eta_L - d)} =: C_j$, and $\mathbf{A}_{N_j}$ has
$\alpha_j 2^j$ non-vanishing entries per column with $\alpha_j \sim 2^d (j+1)^{-2d}$. It is immediate to verify that
$\sum_j \alpha_j < \infty$. Moreover, for $s < s^*$ and setting $\delta = s^* - s$, we clearly have $\sum_j C_j 2^{sj/d} =
\sum_j 2^{-j\delta/d} (j+1)^{2s^*} < \infty$. □

We now consider the continuity properties of the operator $L$ between sparsity spaces. The
following result is well known (see e.g. [10]) and its proof is here reported for completeness.

Proposition 5.1 (continuity of $L$ in $\mathcal{A}^s_B$) Let $\mathbf{A} \in \mathcal{D}_a(\eta_L)$, $\eta_L > d$ and $s^* = \eta_L - d$. For
any $s < s^*$, if $v \in \mathcal{A}^s_B$ then $Lv \in \mathcal{A}^s_B$, with

$$\|Lv\|_{\mathcal{A}^s_B} \lesssim \|v\|_{\mathcal{A}^s_B} .$$

The constants appearing in the bounds go to infinity as $s$ approaches $s^*$.

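Proof. Let us choose $N_j = \lceil 2^{j/d} / (j+1)^2 \rceil$ as in the proof of Lemma 5.1. If we set $\mathbf{A}_j := \mathbf{A}_{N_j}$, then by
Property 2.4 (algebraic case) we have

$$\|\mathbf{A} - \mathbf{A}_j\| \lesssim 2^{-j(\eta_L - d)/d} (j+1)^{2(\eta_L - d)} = 2^{-js^*/d} (j+1)^{2s^*} .$$

On the other hand, for any $j \ge 0$, let $\mathbf{v}_j = P_j(\mathbf{v})$ be a best $2^j$-term approximation of $\mathbf{v} \in \ell^s_B$,
which therefore satisfies $\|\mathbf{v} - \mathbf{v}_j\| \le 2^{-js/d} \|\mathbf{v}\|_{\ell^s_B}$. Note that the difference $\mathbf{v}_j - \mathbf{v}_{j-1}$ satisfies as
well

$$\|\mathbf{v}_j - \mathbf{v}_{j-1}\| \lesssim 2^{-js/d}\, \|\mathbf{v}\|_{\ell^s_B} .$$

Let

$$\mathbf{w}_J = \sum_{j=0}^{J} \mathbf{A}_{J-j} (\mathbf{v}_j - \mathbf{v}_{j-1}) ,$$

where we set $\mathbf{v}_{-1} = 0$. Writing $\mathbf{v} = \mathbf{v} - \mathbf{v}_J + \sum_{j=0}^{J} (\mathbf{v}_j - \mathbf{v}_{j-1})$, we obtain

$$\mathbf{A}\mathbf{v} - \mathbf{w}_J = \mathbf{A}(\mathbf{v} - \mathbf{v}_J) + \sum_{j=0}^{J} (\mathbf{A} - \mathbf{A}_{J-j})(\mathbf{v}_j - \mathbf{v}_{j-1}) .$$

The last equation yields

$$\|\mathbf{A}\mathbf{v} - \mathbf{w}_J\| \le \|\mathbf{A}\| \|\mathbf{v} - \mathbf{v}_J\| + \sum_{j=0}^{J} \|\mathbf{A} - \mathbf{A}_{J-j}\| \|\mathbf{v}_j - \mathbf{v}_{j-1}\|$$

$$\lesssim \Big( 2^{-Js/d} + \sum_{j=0}^{J} 2^{-(J-j)s^*/d} (J - j + 1)^{2s^*} 2^{-js/d} \Big) \|\mathbf{v}\|_{\ell^s_B}$$

$$\lesssim 2^{-Js/d} \Big( 1 + \sum_{j=0}^{J} 2^{-(J-j)(s^* - s)/d} (J - j + 1)^{2s^*} \Big) \|\mathbf{v}\|_{\ell^s_B} \lesssim 2^{-Js/d}\, \|\mathbf{v}\|_{\ell^s_B} ,$$

where the series is convergent but degenerates as $s$ approaches $s^*$.

Finally, by construction $\mathbf{w}_J$ belongs to a finite dimensional space $V_{\Lambda_J}$, where

$$|\Lambda_J| \lesssim \omega_d \sum_{j=0}^{J} N^d_{J-j}\, 2^j \lesssim 2^J \sum_{j=0}^{J} (J - j + 1)^{-2d} \lesssim 2^J .$$

This implies $\|\mathbf{A}\mathbf{v}\|_{\ell^s_B} \lesssim \|\mathbf{v}\|_{\ell^s_B}$ for any $s < s^*$. □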
At last, we discuss the sparsity class of the residual $r = r(u_\Lambda)$ for some Galerkin solution $u_\Lambda$.

Proposition 5.2 (sparsity class of the residual) Let the assumptions of Property 2.2 be satisfied, and set $s^* = \eta_L - d$. For any $s < s^*$, if $u \in \mathcal{A}^s_B$ then $r(u_\Lambda) \in \mathcal{A}^s_B$ for any index set $\Lambda$, with
\[ \|r(u_\Lambda)\|_{\mathcal{A}^s_B} \lesssim \|u\|_{\mathcal{A}^s_B} . \]
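Proof. Denoting by $\mathbf{r}_\Lambda$ the vector representing $r(u_\Lambda)$ and using Proposition 5.1 we get
\[ \|\mathbf{r}_\Lambda\|_{\ell^s_B} = \|\mathbf{A}(\mathbf{u} - \mathbf{u}_\Lambda)\|_{\ell^s_B} \lesssim \|\mathbf{u} - \mathbf{u}_\Lambda\|_{\ell^s_B} \lesssim \|\mathbf{u}\|_{\ell^s_B} + \|\mathbf{u}_\Lambda\|_{\ell^s_B} . \tag{5.4} \]
At this point, we invoke the equivalent formulation of the Galerkin problem given by (2.24), which yields $\mathbf{u}_\Lambda = (\mathbf{A}_\Lambda)^{-1}(\mathbf{P}_\Lambda \mathbf{f})$. Using $\mathbf{A} \in \mathcal{D}_a(\eta_L)$ and combining Property 2.5 together with Property 2.2 we obtain $(\mathbf{A}_\Lambda)^{-1} \in \mathcal{D}_a(\eta_L)$. Hence, applying Proposition 5.1 to $(\mathbf{A}_\Lambda)^{-1}$ we get
\[ \|\mathbf{u}_\Lambda\|_{\ell^s_B} = \|(\mathbf{A}_\Lambda)^{-1}(\mathbf{P}_\Lambda \mathbf{f})\|_{\ell^s_B} \lesssim \|\mathbf{P}_\Lambda \mathbf{f}\|_{\ell^s_B} \le \|\mathbf{f}\|_{\ell^s_B} , \]
where the last step is an easy consequence of the definition of the projector $\mathbf{P}_\Lambda$. By substituting the above inequality into (5.4), we finally obtain
\[ \|\mathbf{r}_\Lambda\|_{\ell^s_B} \lesssim \|\mathbf{u}\|_{\ell^s_B} + \|\mathbf{f}\|_{\ell^s_B} = \|\mathbf{u}\|_{\ell^s_B} + \|\mathbf{A}\mathbf{u}\|_{\ell^s_B} \lesssim \|\mathbf{u}\|_{\ell^s_B} , \tag{5.5} \]
where in the last inequality we used again Proposition 5.1. □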

We observe that the previous bound is tailored to the worst-case scenario: one indeed expects that, for $\Lambda$ large enough, the residual becomes progressively smaller than the solution.

5.2 Exponential case 

As already alluded to in the Introduction, and in striking contrast to the previous algebraic case, the implication $v \in \mathcal{A}^{\eta,t}_G \Rightarrow Lv \in \mathcal{A}^{\eta,t}_G$ is false. The following counter-examples prove this fact, and shed light on what the correct implication could be.

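Example 5.3 (Banded matrices) Fix $d = 1$ and $t = 1$ (hence, $\tau = t/d = 1$). Recalling the expression (2.14) for the entries of $\mathbf{A}$, let us choose $\hat\nu_0 = \hat\sigma_0 = \sqrt{2\pi}$, which gives
\[ a_{\ell,\ell} = 1 \qquad \forall\, \ell . \]
Next, let us choose $\hat\sigma_h = 0$ for all $h \ne 0$, which implies (because $d = 1$)
\[ |a_{\ell,k}| = \frac{1}{\sqrt{2\pi}}\, \frac{|\ell|\,|k|}{c_\ell\, c_k}\, |\hat\nu_{\ell-k}| , \qquad \ell \ne k , \]
i.e.,
\[ \frac{1}{2\sqrt{2\pi}}\, |\hat\nu_{\ell-k}| \le |a_{\ell,k}| \le \frac{1}{\sqrt{2\pi}}\, |\hat\nu_{\ell-k}| , \qquad \ell \ne k , \ |\ell|, |k| \ge 1 . \]
At this point, let us fix a real $\eta_L > 0$ and an integer $p \ge 0$, and let us choose the coefficients $\hat\nu_h$ for $h \ne 0$ to satisfy
\[ |\hat\nu_h| = \begin{cases} \sqrt{2\pi}\, e^{-\eta_L |h|} & \text{if } 0 < |h| \le p , \\ 0 & \text{if } |h| > p . \end{cases} \]
In summary, the coefficient $\nu$ of the elliptic operator $L$ is a trigonometric polynomial of degree $p$, whereas the coefficient $\sigma$ is a constant. The corresponding stiffness matrix $\mathbf{A}$ is banded with $2p + 1$ non-zero diagonals, and satisfies
\[ \tfrac12\, e^{-\eta_L |\ell - k|} \le |a_{\ell,k}| \le e^{-\eta_L |\ell - k|} , \qquad 0 \le |\ell - k| \le p , \ |\ell|, |k| \ge 1 . \tag{5.6} \]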

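In order to define the vector $\mathbf{v}$, let us introduce the function $\iota : \mathbb{N}^* \to \mathbb{N}^*$, $\iota(n) = 2(p+1)n$. Let us fix a real $\eta > 0$ and let us define the components $(\mathbf{v})_k = \hat v_k$ of the vector in such a way that
\[ |(\mathbf{v})_k| = \begin{cases} e^{-\frac{\eta}{2} n} & \text{if } k = \iota(n) \text{ for some } n \ge 1 , \\ 0 & \text{otherwise} . \end{cases} \]
Thus, the rearranged components $(\mathbf{v})^*_n$ satisfy $|(\mathbf{v})^*_n| = e^{-\frac{\eta}{2} n}$, $n \ge 1$, whence $\mathbf{v} \in \ell^{\eta,1}_G(\mathbb{Z})$ (or, equivalently, $v \in \mathcal{A}^{\eta,1}_G$), with $\|\mathbf{v}\|_{\ell^{\eta,1}_G(\mathbb{Z})} = 1$, according to Definition 4.4.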

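The definition of the mapping $\iota$ and the banded structure of $\mathbf{A}$ imply that the only non-zero components of $\mathbf{A}\mathbf{v}$ are those of indices $\iota(n) + q$ for some $n \ge 1$ and $q \in [-p, p]$. For these components one has
\[ (\mathbf{A}\mathbf{v})_{\iota(n)+q} = a_{\iota(n)+q,\,\iota(n)}\, (\mathbf{v})_{\iota(n)} ; \]
thus, recalling (5.6), we easily obtain
\[ \tfrac12\, e^{-\eta_L p}\, e^{-\frac{\eta}{2} n} \le |(\mathbf{A}\mathbf{v})_{\iota(n)+q}| \le e^{-\frac{\eta}{2} n} , \qquad q \in [-p, p] . \tag{5.7} \]
This shows that, for any integer $N \ge 1$,
\[ \#\big\{ \ell : |(\mathbf{A}\mathbf{v})_\ell| \ge \tfrac12\, e^{-\eta_L p}\, e^{-\frac{\eta}{2} N} \big\} \ge (2p+1)N , \]
hence
\[ |(\mathbf{A}\mathbf{v})^*_{(2p+1)N}|\; e^{\frac{\eta}{2}(2p+1)N} \ge \tfrac12\, e^{-\eta_L p}\, e^{\eta p N} \to +\infty \quad \text{as } N \to +\infty , \]
i.e., $\mathbf{A}\mathbf{v} \notin \ell^{\eta,1}_G(\mathbb{Z})$ (or, equivalently, $Lv \notin \mathcal{A}^{\eta,1}_G$) regardless of the relative values of $\eta_L$ and $\eta$.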

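On the other hand, let $m_p$ be the smallest integer such that $\tfrac12\, e^{-\eta_L p} \ge e^{-\frac{\eta}{2} m_p}$. Given any $m \ge 1$, let $N \ge 1$ and $Q \in [-p, p]$ be such that $(\mathbf{A}\mathbf{v})^*_m = (\mathbf{A}\mathbf{v})_{\iota(N)+Q}$, which combined with (5.7) yields
\[ e^{-\frac{\eta}{2}(N + m_p)} \le |(\mathbf{A}\mathbf{v})^*_m| \le e^{-\frac{\eta}{2} N} . \]
The rightmost inequality in (5.7), namely $|(\mathbf{A}\mathbf{v})_{\iota(N+m_p)+q}| \le e^{-\frac{\eta}{2}(N+m_p)}$, shows that there are at most $(2p+1)(N + m_p)$ components of $\mathbf{A}\mathbf{v}$ that are larger than $e^{-\frac{\eta}{2}(N+m_p)}$ in modulus. This implies $m \le (2p+1)(N + m_p)$, whence
\[ e^{-\frac{\eta}{2} N} \le e^{\frac{\eta}{2} m_p}\; e^{-\frac{\eta}{2}\,\frac{m}{2p+1}} . \]
Setting $\bar\eta = \frac{\eta}{2p+1}$, we conclude that $\mathbf{A}\mathbf{v} \in \ell^{\bar\eta,1}_G(\mathbb{Z})$ (or, equivalently, $Lv \in \mathcal{A}^{\bar\eta,1}_G$), with
\[ \|\mathbf{A}\mathbf{v}\|_{\ell^{\bar\eta,1}_G(\mathbb{Z})} \le e^{\frac{\eta}{2} m_p}\, \|\mathbf{v}\|_{\ell^{\eta,1}_G(\mathbb{Z})} . \]
Therefore, the sparsity class of $\mathbf{A}\mathbf{v}$ deteriorates from $\ell^{\eta,1}_G(\mathbb{Z})$ for $\mathbf{v}$ to $\ell^{\bar\eta,1}_G(\mathbb{Z})$ with $\bar\eta = \frac{\eta}{2p+1}$. □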
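
The deterioration described in Example 5.3 can be observed numerically. The following is a minimal sketch, not the authors' code: it assumes that the off-diagonal entries attain the upper bound in (5.6), i.e. $|a_{\ell,k}| = e^{-\eta_L|\ell-k|}$ for $|\ell-k| \le p$, and it only uses the first `n_max` non-zero entries of $\mathbf{v}$. The function names `banded_example` and `weighted_sup` are hypothetical. Since $\iota(n) = 2(p+1)n$ leaves gaps wider than the band, each non-zero entry of $\mathbf{A}\mathbf{v}$ is a single product.

```python
import math

def banded_example(p, eta, eta_L, n_max):
    """Moduli of the non-zero entries of A v, sorted decreasingly.

    Assumption: |a_{l,k}| = exp(-eta_L * |l - k|) inside the band
    (the upper bound in (5.6)); v has modulus exp(-eta*n/2) at iota(n).
    """
    entries = []
    for n in range(1, n_max + 1):
        v_n = math.exp(-0.5 * eta * n)
        for q in range(-p, p + 1):
            entries.append(math.exp(-eta_L * abs(q)) * v_n)
    entries.sort(reverse=True)
    return entries

def weighted_sup(entries, rate):
    """sup_m exp(rate*m/2) * x*_m over the computed entries (d = t = 1)."""
    return max(math.exp(0.5 * rate * m) * x
               for m, x in enumerate(entries, start=1))

p, eta, eta_L, n_max = 2, 1.0, 1.0, 20
x = banded_example(p, eta, eta_L, n_max)
print("weight with rate eta        :", weighted_sup(x, eta))
print("weight with rate eta/(2p+1) :", weighted_sup(x, eta / (2 * p + 1)))
```

With the original rate $\eta$ the weighted supremum grows with `n_max`, whereas with the reduced rate $\eta/(2p+1)$ it stays below $e^{\frac{\eta}{2} m_p}$, in line with the two bounds of the example.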

The next counter-example shows that, when the stiffness matrix $\mathbf{A}$ is not banded, in order to have $\mathbf{A}\mathbf{v} \in \ell^{\bar\eta,\bar t}_G(\mathbb{Z})$ it is not enough to choose some $\bar\eta < \eta$ as above, but a choice of $\bar t < t$ is mandatory.

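Example 5.4 (Dense matrices) Let us take again $d = t = 1$ and modify the setting of the previous example, by assuming now that the coefficients $\hat\nu_h$ satisfy
\[ |\hat\nu_h| = \sqrt{2\pi}\, e^{-\eta_L |h|} \qquad \text{for all } |h| > 0 , \]
so that $\mathbf{A}$ is no longer banded, and its elements satisfy
\[ \tfrac12\, e^{-\eta_L |\ell - k|} \le |a_{\ell,k}| \le e^{-\eta_L |\ell - k|} \qquad \text{for all } |\ell|, |k| \ge 1 . \tag{5.8} \]
If $M > 0$ is an arbitrary integer, we now construct a vector $\mathbf{v}^M = \sum_{n \ge 1} \mathbf{v}^{M,n}$ with gaps of size $\lambda(M) \ge M$ between consecutive non-vanishing entries. To this end, we introduce the function $\iota_M : \mathbb{N}^* \to \mathbb{N}^*$ defined as $\iota_M(n) := \lambda(M)\, n$ and the vectors $\mathbf{v}^{M,n}$ with components
\[ |(\mathbf{v}^{M,n})_k| = e^{-\frac{\eta}{2} n}\, \delta_{k,\iota_M(n)} , \qquad k \in \mathbb{Z} . \]
From (5.8) and the fact that only the $\iota_M(n)$-th entry of $\mathbf{v}^{M,n}$ does not vanish, we obtain
\[ \tfrac12\, e^{-\eta_L |\ell - \iota_M(n)|}\, e^{-\frac{\eta}{2} n} \le |(\mathbf{A}\mathbf{v}^{M,n})_\ell| \le e^{-\eta_L |\ell - \iota_M(n)|}\, e^{-\frac{\eta}{2} n} . \tag{5.9} \]
As in Example 5.3 it is obvious that $\mathbf{v}^M \in \ell^{\eta,1}_G(\mathbb{Z})$ with $\|\mathbf{v}^M\|_{\ell^{\eta,1}_G(\mathbb{Z})} = 1$. However, we will prove below that $\|\mathbf{A}\mathbf{v}^M\|_{\ell^{\bar\eta,\bar t}_G} \lesssim \|\mathbf{v}^M\|_{\ell^{\eta,1}_G}$ cannot hold uniformly in $M$ for any $\bar\eta > 0$ and $\bar t > 1/2$.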

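We start by examining the cardinality $\#\mathcal{F}_n$ of the set
\[ \mathcal{F}_n := \big\{ \ell \in \mathbb{Z} : |(\mathbf{A}\mathbf{v}^{M,n})_\ell| > e^{-\frac{\eta}{2} M} \big\} . \]
In view of (5.9), the condition $|(\mathbf{A}\mathbf{v}^{M,n})_\ell| > e^{-\frac{\eta}{2} M}$ is satisfied by those $\ell = \iota_M(n) + m$ such that
\[ 0 \le |m| < \frac{\eta}{2\eta_L}\,(M - n) , \]
whence $n < M$ and $\#\mathcal{F}_n \ge \frac{\eta}{\eta_L}(M - n) + 1$. We now claim that
\[ C_M := \#\big\{ \ell : |(\mathbf{A}\mathbf{v}^M)_\ell| \ge e^{-\frac{\eta}{2} M} \big\} \ge \sum_{n=1}^{M} \#\mathcal{F}_n , \tag{5.10} \]
whose proof we postpone. Assuming (5.10) we see that
\[ C_M \ge \sum_{n=1}^{M} \Big( \frac{\eta}{\eta_L}(M - n) + 1 \Big) \approx \frac{\eta}{2\eta_L}\, M^2 , \]
or equivalently there are about $N_M = \frac{\eta}{2\eta_L} M^2$ coefficients of $\mathbf{A}\mathbf{v}^M$ with values at least $e^{-\frac{\eta}{2} M}$. This implies that the $N_M$-th rearranged coefficient of $\mathbf{A}\mathbf{v}^M$ satisfies
\[ |(\mathbf{A}\mathbf{v}^M)^*_{N_M}| \ge e^{-\frac{\eta}{2} M} = e^{-(\eta\eta_L/2)^{1/2}\, N_M^{1/2}} \qquad \text{for all } M \ge 1 . \]
This proves that for any $\bar\eta > 0$ and $\bar t > \tfrac12$, one has
\[ \|\mathbf{A}\mathbf{v}^M\|_{\ell^{\bar\eta,\bar t}_G(\mathbb{Z})} \ge |(\mathbf{A}\mathbf{v}^M)^*_{N_M}|\; e^{\bar\eta\, 2^{-\bar t} N_M^{\bar t}} \ge e^{\bar\eta\, 2^{-\bar t} N_M^{\bar t} - (\eta\eta_L/2)^{1/2} N_M^{1/2}} \to +\infty \quad \text{as } M \to \infty , \]
whence the following bound cannot be valid:
\[ \|\mathbf{A}\mathbf{v}\|_{\ell^{\bar\eta,\bar t}_G(\mathbb{Z})} \lesssim \|\mathbf{v}\|_{\ell^{\eta,1}_G(\mathbb{Z})} \qquad \text{for all } \mathbf{v} \in \ell^{\eta,1}_G(\mathbb{Z}) . \]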

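It remains to prove (5.10). We first note that the sets $\mathcal{F}_n$ are disjoint provided $\iota_M(n+1) - \iota_M(n) = \lambda(M) > \frac{\eta}{\eta_L} M$. We next set
\[ \varepsilon_M := \min_{1 \le n \le M}\ \min_{\ell \in \mathcal{F}_n} |(\mathbf{A}\mathbf{v}^{M,n})_\ell| - e^{-\frac{\eta}{2} M} > 0 , \]
which is a constant only dependent on $M$. We observe that for every $\ell \in \mathcal{F}_n$, there holds
\[ |(\mathbf{A}\mathbf{v}^M)_\ell| \ge |(\mathbf{A}\mathbf{v}^{M,n})_\ell| - \Big| \sum_{p \ne n} (\mathbf{A}\mathbf{v}^{M,p})_\ell \Big| \ge e^{-\frac{\eta}{2} M} + \varepsilon_M - \sum_{p \ne n} |(\mathbf{A}\mathbf{v}^{M,p})_\ell| . \tag{5.11} \]
We write $\ell \in \mathcal{F}_n$ as $\ell = \iota_M(n) + m$, make use of (5.9) and the definition of $\iota_M(n) = \lambda(M)\, n$ to deduce
\[ \sum_{p \ne n} |(\mathbf{A}\mathbf{v}^{M,p})_\ell| \le \sum_{p \ne n} e^{-\eta_L |\ell - \iota_M(p)|} \le \sum_{p \ne n} e^{-\eta_L |m + \lambda(M)(n-p)|} \le \sum_{p \ne n} e^{-\eta_L (\lambda(M)|n-p| - |m|)} . \]
Since $|m| < \frac{\eta}{2\eta_L} M$, the above inequality gives
\[ \sum_{p \ne n} |(\mathbf{A}\mathbf{v}^{M,p})_\ell| \le e^{\frac{\eta}{2} M} \sum_{p \ne n} e^{-\eta_L \lambda(M) |n-p|} \le 2\, e^{\frac{\eta}{2} M} \sum_{q \ge 1} e^{-\eta_L \lambda(M)\, q} . \tag{5.12} \]
Combining (5.11) and (5.12) yields
\[ |(\mathbf{A}\mathbf{v}^M)_\ell| \ge e^{-\frac{\eta}{2} M} + \varepsilon_M - 2\, e^{\frac{\eta}{2} M} \sum_{q \ge 1} e^{-\eta_L \lambda(M)\, q} . \]
By choosing $\lambda(M)$ sufficiently large, the last term on the right-hand side of the above inequality can be made arbitrarily small, in particular $\le \varepsilon_M$. We thus get $|(\mathbf{A}\mathbf{v}^M)_\ell| \ge e^{-\frac{\eta}{2} M}$ and prove (5.10). □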



Guided by Examples 5.3 and 5.4, we are ready to state the main result of this section. We define
\[ \zeta(t) := \left( \frac{1+t}{\omega_d^{\,1+t}} \right)^{\frac{t}{d(1+t)}} \qquad \forall\, 0 < t \le d . \tag{5.13} \]
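Proposition 5.3 (continuity of $L$ in $\mathcal{A}^{\eta,t}_G$) Let the differential operator $L$ be such that the corresponding stiffness matrix satisfies $\mathbf{A} \in \mathcal{D}_e(\eta_L)$ for some constant $\eta_L > 0$. Assume that $v \in \mathcal{A}^{\eta,t}_G$ for some $\eta > 0$ and $t \in (0, d]$. Let one of the two following sets of conditions be satisfied.

(a) If the matrix $\mathbf{A}$ is banded with $2p + 1$ non-zero diagonals, let us set
\[ \bar\eta = \frac{\eta}{(2p+1)^{t/d}} , \qquad \bar t = t . \]

(b) If the matrix $\mathbf{A}$ is dense, but the coefficients $\eta_L$ and $\eta$ satisfy the inequality $\eta < \eta_L\, \omega_d^{t/d}$, let us set
\[ \bar\eta = \zeta(t)\, \eta , \qquad \bar t = \frac{t}{1+t} . \]

Then, one has $Lv \in \mathcal{A}^{\bar\eta,\bar t}_G$, with
\[ \|Lv\|_{\mathcal{A}^{\bar\eta,\bar t}_G} \lesssim \|v\|_{\mathcal{A}^{\eta,t}_G} . \]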

Proof. We adapt to our situation the technique introduced in [7]. Let $L_J$ ($J \ge 0$) be the differential operator obtained by truncating the Fourier expansion of the coefficients of $L$ to the modes $k$ satisfying $|k| \le J$. Equivalently, $L_J$ is the operator whose stiffness matrix $\mathbf{A}_J$ is defined in (2.22); thus, by Property 2.4 (exponential case) we have
\[ \|L - L_J\| = \|\mathbf{A} - \mathbf{A}_J\| \le C_{\mathbf{A}}\, (J+1)^{d-1}\, e^{-\eta_L J} . \]
On the other hand, for any $j \ge 1$, let $v_j = P_j(v)$ be a best $j$-term approximation of $v$ (with $v_0 = 0$), which therefore satisfies $\|v - v_j\| \le e^{-\eta\, \omega_d^{-\tau} j^\tau}\, \|v\|_{\mathcal{A}^{\eta,t}_G}$ with $\tau = t/d$. Note that the difference $v_j - v_{j-1}$ consists of a single Fourier mode and satisfies as well
\[ \|v_j - v_{j-1}\| \lesssim e^{-\eta\, \omega_d^{-\tau} j^\tau}\, \|v\|_{\mathcal{A}^{\eta,t}_G} . \]
Finally, let us introduce the function $\chi : \mathbb{N} \to \mathbb{N}$ defined as $\chi(j) = \lceil j^\tau \rceil$, the smallest integer larger than or equal to $j^\tau$.

For any $J \ge 1$, let $w_J$ be the approximation of $Lv$ defined as
\[ w_J = \sum_{j=1}^{J} L_{\chi(J-j)} (v_j - v_{j-1}) . \]
Writing $v = v - v_J + \sum_{j=1}^{J} (v_j - v_{j-1})$, we obtain
\[ Lv - w_J = L(v - v_J) + \sum_{j=1}^{J} (L - L_{\chi(J-j)})(v_j - v_{j-1}) . \]
We now assume to be in Case (b). Since $L : \ell^2(\mathbb{Z}^d) \to \ell^2(\mathbb{Z}^d)$ is continuous, the last equation yields
\[ \|Lv - w_J\| \lesssim \Big( e^{-\eta\, \omega_d^{-\tau} J^\tau} + \sum_{j=1}^{J} \big( \lceil (J-j)^\tau \rceil + 1 \big)^{d-1} e^{-\left( \eta_L \lceil (J-j)^\tau \rceil + \eta\, \omega_d^{-\tau} j^\tau \right)} \Big)\, \|v\|_{\mathcal{A}^{\eta,t}_G} . \tag{5.15} \]
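The exponents of the addends can be bounded from below as follows, because $\tau \le 1$:
\begin{align*}
\eta_L \lceil (J-j)^\tau \rceil + \eta\, \omega_d^{-\tau} j^\tau &= \eta_L \lceil (J-j)^\tau \rceil - \eta\, \omega_d^{-\tau} (J-j)^\tau + \eta\, \omega_d^{-\tau} \big( (J-j)^\tau + j^\tau \big) \\
&\ge \eta_L (J-j)^\tau - \eta\, \omega_d^{-\tau} (J-j)^\tau + \eta\, \omega_d^{-\tau} \big( (J-j) + j \big)^\tau \\
&= \beta\, (J-j)^\tau + \eta\, \omega_d^{-\tau} J^\tau ,
\end{align*}
with $\beta = \eta_L - \eta\, \omega_d^{-\tau} > 0$ by assumption. Then, (5.15) yields
\[ \|Lv - w_J\| \lesssim \Big( 1 + \sum_{j=0}^{J-1} \big( \lceil j^\tau \rceil + 1 \big)^{d-1} e^{-\beta j^\tau} \Big)\, e^{-\eta\, \omega_d^{-\tau} J^\tau}\, \|v\|_{\mathcal{A}^{\eta,t}_G} \lesssim e^{-\eta\, \omega_d^{-\tau} J^\tau}\, \|v\|_{\mathcal{A}^{\eta,t}_G} . \tag{5.16} \]
On the other hand, by construction $w_J$ belongs to a finite dimensional space $V_{\Lambda_J}$, where
\[ |\Lambda_J| \le \omega_d \sum_{j=1}^{J} \chi(J-j)^d = \omega_d \sum_{j=0}^{J-1} \lceil j^\tau \rceil^d \sim \frac{\omega_d}{1+t}\, J^{1+t} \qquad \text{as } J \to \infty . \tag{5.17} \]
This implies
\[ \|Lv\|_{\mathcal{A}^{\bar\eta,\bar t}_G} \lesssim \|v\|_{\mathcal{A}^{\eta,t}_G} , \]
with $\bar\tau = \bar t/d = \frac{\tau}{1+t}$ and $\bar\eta = \Big( \frac{1+t}{\omega_d^{1+t}} \Big)^{\bar\tau} \eta = \zeta(t)\, \eta$ as asserted.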

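We last consider Case (a). One has $L_{\chi(J-j)} = L$ if $\chi(J-j) \ge p$, hence if $j \le J - p^{1/\tau}$; then the summation in (5.15) can be limited to those $j$ satisfying $j_p \le j \le J$, where $j_p = \lceil J - p^{1/\tau} \rceil$. Therefore
\[ \|Lv - w_J\| \lesssim \Big( e^{-\eta\, \omega_d^{-\tau} J^\tau} + \max_{j_p \le j \le J} \big( \lceil (J-j)^\tau \rceil + 1 \big)^{d-1} \sum_{j=j_p}^{J} e^{-\eta\, \omega_d^{-\tau} j^\tau} \Big)\, \|v\|_{\mathcal{A}^{\eta,t}_G} . \]
Now, $J - j \le p^{1/\tau}$ if $j_p \le j \le J$ and $j^\tau \ge j_p^\tau \ge (J - p^{1/\tau})^\tau \ge J^\tau - p$, whence
\[ \|Lv - w_J\| \lesssim \Big( 1 + p^{\,d-1+1/\tau}\, e^{\eta\, \omega_d^{-\tau} p} \Big)\, e^{-\eta\, \omega_d^{-\tau} J^\tau}\, \|v\|_{\mathcal{A}^{\eta,t}_G} . \]
We conclude by observing that $|\Lambda_J| \le (2p+1) J$, since any matrix $\mathbf{A}_J$ has at most $2p + 1$ diagonals. □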



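Finally, we discuss the sparsity class of the residual $r = r(u_\Lambda)$ for any Galerkin solution $u_\Lambda$.

Proposition 5.4 (sparsity class of the residual) Let $\mathbf{A} \in \mathcal{D}_e(\eta_L)$ and $\mathbf{A}^{-1} \in \mathcal{D}_e(\bar\eta_L)$, for constants $\eta_L > 0$ and $\bar\eta_L \in (0, \eta_L]$ according to Property 2.3, and let $1 \le d \le 10$. If $u \in \mathcal{A}^{\eta,t}_G$ for some $\eta > 0$ and $t \in (0, d]$, such that $\eta < \omega_d^{t/(d(1+2t))}\, \bar\eta_L$, then there exist suitable positive constants $\bar\eta \le \eta$ and $\bar t \le t$ such that $r(u_\Lambda) \in \mathcal{A}^{\bar\eta,\bar t}_G$ for any index set $\Lambda$, with
\[ \|r(u_\Lambda)\|_{\mathcal{A}^{\bar\eta,\bar t}_G} \lesssim \|u\|_{\mathcal{A}^{\eta,t}_G} . \]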
Proof. We first remark that the hypothesis $1 \le d \le 10$ guarantees $\omega_d \ge 2$ (see e.g. [15, Corollary
2.55]); this implies r < u: r d for any r > 0, whence the function ( introduced in f)5. 13|) satisfies 
< 1 for any t > 0. Assume for the moment we are given fj and t. By using Proposition 15.31 
and Lemma 14.11 we get 

ll r A|Uf= ll A ( u - u a)|U* ^ ||u-u A |Li,ti < ||u|| 2 n t + \\-a A \\ on t (5.18) 

l G G G t Q l G 

where, f = t/d, t\ =t\/d and the following relations hold 

1 + ti 

From (2.24) we have $\mathbf{u}_\Lambda = (\mathbf{A}_\Lambda)^{-1}(\mathbf{P}_\Lambda \mathbf{f})$. Using Property 2.5 and applying Proposition 5.3 to $(\mathbf{A}_\Lambda)^{-1}$ we get

ll u A|L2n^, tl = ||(A A )" 1 (PAf)|| 2n, ;i , tl < ||PAf||^2.*2 < ||f[U 2 >*a , 

£q *-q G G 

with 

2 T1 r?i = C(t2)m <V2, h = < t 2 . 

By substituting the above inequality into (5.18) and using again Proposition 5.3 we get

H r A|U,f < llull^n^i.t! + [|f|Li2.*2 = 1 1 u l Lan^.tj + ||Au|| ,, 2 ,t 2 < ||u|L,,,t (5.19) 

G <-G G *G G G 

where 

m = C(t)v < v , h = < 1 ■ 

This shows that the thesis holds true for the choice 

i \2) s Vl + 2t/ s Vl + t7 sw " l + 3t 

It remains to verify the assumptions of Proposition 5.3 when $\mathbf{A}$ is dense. Since $\omega_d \ge 2$ and

tl = TT2t <t2 = TTt <t ^ 

we have wj 1 < wj 2 < wj. Moreover, using r/i < 2 n r/i < 7/2 < 77 and rji > fj^ > ^ T1 rj yields 

which are the required conditions to apply Proposition 5.3 when $\mathbf{A}$ is dense. This concludes the proof. □



Remark 5.1 (definition of $\omega_d$) The limitation $1 \le d \le 10$ stems from the fact that the measure $\omega_d$ of the unit Euclidean ball in $\mathbb{R}^d$ monotonically decreases to 0 as $d \to \infty$. To avoid such a restriction, one could modify the definition of the Gevrey classes $G^{\eta,t}_p(\Omega)$ given in (4.7), by replacing the Euclidean norm $|k| = \|k\|_2$ appearing in the exponential by the maximum norm $\|k\|_\infty$. Consequently, throughout the rest of the paper $\omega_d$ would be replaced by the quantity $2^d$,
strictly larger than 1 for any d. □ 
6 Coarsening



We start by considering an example that sheds light on the role of coarsening for the exponential 
case. We then state and prove a seemingly new coarsening result, which is valid for both classes. 

6.1 Example of coarsening 

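Let $\mathbf{a}, \mathbf{b} \in \mathbb{R}^p$ for $p \ge 1$ be the vectors
\[ \mathbf{a} := (1, 0, \dots, 0) , \qquad \mathbf{b} := \frac{1}{p}\,(1, 1, \dots, 1) . \]
Let $\mathbf{v}, \mathbf{z}$ be the sequences defined by
\[ \mathbf{v} := \big( e^{-\eta k}\, \mathbf{a} \big)_{k=0}^{\infty} , \qquad \mathbf{z} := \big( e^{-\eta k}\, \mathbf{b} \big)_{k=0}^{\infty} . \]
We first observe that
\[ \|\mathbf{v}\|^2 = p\, \|\mathbf{z}\|^2 , \qquad \|\mathbf{v}\|_{\ell^{2\eta,1}_G(\mathbb{Z})} = p\, \|\mathbf{z}\|_{\ell^{2\eta/p,1}_G(\mathbb{Z})} \]
(recall that $\omega_d = 2$ for $d = 1$). Given a parameter $\varepsilon < 1$, we now construct a perturbation $\mathbf{w}$ of $\mathbf{v}$ which is much less sparse than $\mathbf{v}$ by simply scaling $\mathbf{z}$ and adding it to $\mathbf{v}$ (see Fig. 2 (a)):
\[ \mathbf{w} := \mathbf{v} + \varepsilon\, \mathbf{z} = \big( e^{-\eta k} (\mathbf{a} + \varepsilon\, \mathbf{b}) \big)_{k=0}^{\infty} . \]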



Figure 2: Pictorial representation of (a) the components of the vector $\mathbf{w} = \mathbf{v} + \varepsilon\mathbf{z}$ and (b) its rearrangement $\mathbf{w}^*$. It turns out that $\mathbf{w}^*$ exhibits the decay rate $e^{-k\eta}$ of $\mathbf{v}$ up to a level of accuracy $\|\mathbf{w} - \mathbf{v}\|$ in $\ell^2(\mathbb{Z})$ but a worse decay rate $e^{-k\eta/p}$ of $\mathbf{z}$ for smaller tolerances. Therefore, truncating $\mathbf{w}^*$ with a threshold $\delta > \|\mathbf{w} - \mathbf{v}\|$ captures the behavior of $\mathbf{v}$.

The first task is to compute the norms of $\mathbf{w}$. We obviously have $\|\mathbf{w}\| \approx \|\mathbf{v}\|$. To determine the weak quasi-norm of $\mathbf{w}$ we need to find the rearrangement $\mathbf{w}^*$ (see Fig. 2 (b)). Let $n_1$ be the smallest integer such that
\[ \Big( 1 + \frac{\varepsilon}{p} \Big) e^{-\eta n_1} \ge \frac{\varepsilon}{p}\, e^{-\eta} \ge \Big( 1 + \frac{\varepsilon}{p} \Big) e^{-\eta (n_1 + 1)} , \]
namely the index corresponding to the first crossing of the exponential curve $e^{-\eta n}$ dictating the behavior of the first portion of the rearranged sequence $\mathbf{w}^*$ (which coincides with the behavior of $\mathbf{v}^*$), and the first plateau of $\mathbf{z}$. This implies
\[ \frac{1}{\eta} \log\Big( 1 + \frac{p}{\varepsilon} \Big) \le n_1 \le 1 + \frac{1}{\eta} \log\Big( 1 + \frac{p}{\varepsilon} \Big) . \]
Next, let $n_2$ be the smallest integer such that
\[ \Big( 1 + \frac{\varepsilon}{p} \Big) e^{-\eta n_2} \ge \frac{\varepsilon}{p}\, e^{-2\eta} \ge \Big( 1 + \frac{\varepsilon}{p} \Big) e^{-\eta (n_2 + 1)} , \]
which corresponds to the beginning of a number of decreasing exponentials preceding the second plateau of $\mathbf{w}^*$. This implies
\[ 1 + \frac{1}{\eta} \log\Big( 1 + \frac{p}{\varepsilon} \Big) \le n_2 \le 2 + \frac{1}{\eta} \log\Big( 1 + \frac{p}{\varepsilon} \Big) \]
and shows that $n_2 - n_1 = 1$, and that there is exactly one exponential between the first and second plateaux. Iterating this argument, we see that the difference between two consecutive $n_j$'s is just 1, and that there is exactly one exponential between two consecutive plateaux (see Fig. 2 (b)).

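We are now ready to compute the weak quasi-norm of $\mathbf{w}$. Let $\nu_k$ denote the index corresponding to the end of the $k$-th plateau of $\mathbf{w}^*$, which in turn corresponds to the value $w^*_{\nu_k} = e^{-\eta k}$. Then
\[ \nu_k = pk + n_1 \approx pk + \frac{1}{\eta} \log\Big( 1 + \frac{p}{\varepsilon} \Big) . \]
To determine the class of $\mathbf{w}$, we seek $\lambda$ so that $\mathbf{w} \in \ell^{\lambda,1}_G(\mathbb{Z})$, namely
\[ \sup_{k \ge 0} \big( e^{\lambda \nu_k / 2}\, e^{-\eta k} \big) < \infty \quad \Longleftrightarrow \quad \frac{\lambda}{2}\, pk - \eta k \le 0 \quad \Longleftrightarrow \quad \lambda \le \frac{2\eta}{p} . \]
We thus realize that $\mathbf{w} \in \ell^{2\eta/p,1}_G(\mathbb{Z})$ belongs to a sparsity class much worse than that of $\mathbf{v}$, which deteriorates as the size $p$ of the plateaux tends to $\infty$. On the other hand, we note that the restrictions of $\mathbf{w}^*$ and $\mathbf{v}^*$ to the indices $n \le n_1$ coincide, thereby showing that the decay rate of the first part of $\mathbf{w}^*$ is the same as that of $\mathbf{v}^*$ (see Fig. 2 (b)). This example explains the need to coarsen the vector $\mathbf{w}$ starting at latest at $n_1$, to eliminate the tail of $\mathbf{w}^*$ which decays with rate $2\eta/p$ instead of the optimal rate $2\eta$ of $\mathbf{v}$.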

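In addition, we observe that the best $n_1$-term approximation $\mathbf{w}_{n_1}$ of $\mathbf{w}$ satisfies
\[ \|\mathbf{w} - \mathbf{w}_{n_1}\|^2 \approx p \sum_{k=0}^{\infty} \frac{\varepsilon^2}{p^2}\, e^{-2k\eta} = \frac{\varepsilon^2}{p}\, \frac{1}{1 - e^{-2\eta}} = \|\mathbf{v} - \mathbf{w}\|^2 = \varepsilon^2\, \|\mathbf{z}\|^2 , \]
which is precisely the size of the perturbation error of $\mathbf{v}$. Given an error tolerance $\delta > \varepsilon \|\mathbf{z}\|$, the best $N$-term approximation $\mathbf{w}_N$ of $\mathbf{w}$ satisfying $\|\mathbf{w} - \mathbf{w}_N\| \le \delta$ would require
\[ N \approx \frac{1}{\eta} \log \frac{1}{\delta} \approx \frac{2}{2\eta} \log \frac{\|\mathbf{v}\|_{\ell^{2\eta,1}_G(\mathbb{Z})}}{\delta} . \]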
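
The plateau structure of the rearrangement $\mathbf{w}^*$ can be reproduced with a few lines of code. This is a hedged sketch of the construction of Sect. 6.1, truncated to finitely many blocks; it assumes $\mathbf{a} = (1, 0, \dots, 0)$ and $\mathbf{b} = \frac{1}{p}(1, \dots, 1)$, and the helper names `coarsening_example`, `sq_norm` and `rearranged` are hypothetical.

```python
import math

def coarsening_example(p, eta, eps, levels):
    """Return (v, z, w) as flat lists; block k holds exp(-eta*k) times
    a, b and a + eps*b respectively, for k = 0, ..., levels - 1."""
    a = [1.0] + [0.0] * (p - 1)
    b = [1.0 / p] * p          # assumed normalization of b
    v, z, w = [], [], []
    for k in range(levels):
        s = math.exp(-eta * k)
        v += [s * ai for ai in a]
        z += [s * bi for bi in b]
        w += [s * (ai + eps * bi) for ai, bi in zip(a, b)]
    return v, z, w

def sq_norm(x):
    """Squared l2 norm of a finite sequence."""
    return sum(xi * xi for xi in x)

def rearranged(x):
    """Non-increasing rearrangement of the moduli."""
    return sorted((abs(xi) for xi in x), reverse=True)

v, z, w = coarsening_example(p=4, eta=1.0, eps=0.1, levels=12)
print(sq_norm(v), 4 * sq_norm(z))   # the two squared norms agree
print(rearranged(w)[:10])           # exponential head, then plateaux
```

For these parameters the head of $\mathbf{w}^*$ follows $(1 + \varepsilon/p)\, e^{-\eta n}$ for the first four entries, after which a plateau of equal entries of size $\varepsilon/p$ appears, which is the behavior sketched in Fig. 2 (b).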
6.2 New coarsening result



We extract the following lesson from the example of Sect. 6.1: for as long as we deal with the first part of $\mathbf{w}^*$, which has a decay rate $e^{-k\eta}$ dictated by that of $\mathbf{v}^*$, we could coarsen $\mathbf{w}$ and obtain an approximation of both $\mathbf{w}$ and $\mathbf{v}$ with the decay rate $e^{-k\eta}$ of $\mathbf{v}$. This requires limiting the accuracy to size $\|\mathbf{v} - \mathbf{w}\|$ since a smaller accuracy utilizes the tail of $\mathbf{w}$ which has a slower decay $e^{-k\eta/p}$.

We express this heuristic in the following theorem, which goes back to Cohen, Dahmen, and DeVore [7]. However, our proof is much more elementary and the statement much more precise. Although the result holds for the general setting of Sect. 4.1, we just present it for the exponential case, since it will be used only in this situation.

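Theorem 6.1 (coarsening) Let $\varepsilon > 0$ and let $v \in \mathcal{A}^{\eta,t}_G$ and $w \in V$ be so that
\[ \|v - w\| \le \varepsilon . \]
Let $N = N(\varepsilon)$ be the smallest integer such that the best $N$-term approximation $w_N$ of $w$ satisfies
\[ \|w - w_N\| \le 2\varepsilon . \]
Then, $\|v - w_N\| \le 3\varepsilon$ and
\[ N \le \frac{\omega_d}{\eta^{d/t}} \Big( \log \frac{\|v\|_{\mathcal{A}^{\eta,t}_G}}{\varepsilon} \Big)^{d/t} + 1 . \]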

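Proof. Let $\Lambda_\varepsilon$ be the set of indices corresponding to the best approximation of $v$ with accuracy $\varepsilon$. So $\Lambda_\varepsilon$ is a minimal set with properties
\[ \|v - P_{\Lambda_\varepsilon} v\| \le \varepsilon , \qquad |\Lambda_\varepsilon| \le \frac{\omega_d}{\eta^{d/t}} \Big( \log \frac{\|v\|_{\mathcal{A}^{\eta,t}_G}}{\varepsilon} \Big)^{d/t} + 1 . \]
If $z = w - v$, then
\begin{align*}
\|w - P_{\Lambda_\varepsilon} w\| &\le \|(v + z) - P_{\Lambda_\varepsilon}(v + z)\| = \|(v - P_{\Lambda_\varepsilon} v) + (z - P_{\Lambda_\varepsilon} z)\| \\
&\le \|v - P_{\Lambda_\varepsilon} v\| + \|z - P_{\Lambda_\varepsilon} z\| \le \varepsilon + \|z\| \le 2\varepsilon ,
\end{align*}
because $I - P_{\Lambda_\varepsilon}$ is the projector onto $V_{\mathbb{Z}^d \setminus \Lambda_\varepsilon}$. Since $N$ is the cardinality of the smallest set satisfying the above relation, we deduce that $N \le |\Lambda_\varepsilon|$. The bound $\|v - w_N\| \le \|v - w\| + \|w - w_N\| \le 3\varepsilon$ follows from the triangle inequality. This concludes the proof. □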
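
The coarsening step of Theorem 6.1 amounts to discarding the smallest coefficients of $w$ for as long as the discarded energy stays within $(2\varepsilon)^2$. The sketch below illustrates this on finite coefficient vectors; it is not the paper's COARSE routine, and the function name `coarse` and the test data are assumptions made for illustration.

```python
import math

def coarse(w, eps):
    """Indices kept by the best N-term approximation of w with the
    smallest N such that the l2 norm of the discarded part is <= 2*eps."""
    order = sorted(range(len(w)), key=lambda i: abs(w[i]))  # smallest first
    tail_sq, dropped = 0.0, 0
    for i in order:
        if tail_sq + w[i] * w[i] > (2 * eps) ** 2:
            break
        tail_sq += w[i] * w[i]
        dropped += 1
    return sorted(order[dropped:])

# v decays exponentially; w perturbs v by a flat vector z with ||z|| <= eps.
eps = 1e-3
v = [math.exp(-0.5 * n) for n in range(40)] + [0.0] * 160
c = 0.9 * eps / math.sqrt(len(v))
w = [x + c for x in v]

kept = set(coarse(w, eps))
w_N = [w[i] if i in kept else 0.0 for i in range(len(w))]
err = math.sqrt(sum((a - b) ** 2 for a, b in zip(v, w_N)))
print(len(kept), err)
```

In this run the error $\|v - w_N\|$ stays below $3\varepsilon$ and the number of retained coefficients does not exceed the number needed to approximate $v$ itself with accuracy $\varepsilon$, which is obtained here as `len(coarse(v, eps / 2))`.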



7 Optimality properties of adaptive algorithms: algebraic case 

The rest of the paper will be devoted to investigating complexity issues for the sequence of approximations $u_n = u_{\Lambda_n}$ generated by any of the adaptive algorithms presented in Sect. 3. In particular, we wish to estimate the cardinality of each $\Lambda_n$ and check whether its growth is "optimal" with respect to the sparsity class of the exact solution, in the sense that $|\Lambda_n|$ is comparable to the cardinality of the index set of the best approximation of $u$ yielding the same error $\|u - u_n\|$.

The algebraic case will be dealt with in the present section, whereas the exponential case will 
be analyzed in the next one. The two cases differ in that no coarsening is needed for optimality 
in the former case, whereas we will prove optimality in the latter case only for the algorithms that incorporate a coarsening step. The reason for such a difference can be attributed, on the one hand, to the slower growth of the activated degrees of freedom in the exponential case as opposed to the algebraic case and, on the other hand, to the discrepancy in the sparsity classes of the residuals and the solution in the exponential case, discussed in Sect. 5.2.

7.1 ADFOUR with moderate Dörfler marking

The approach followed in the sequel, which has been proposed in [16] in the wavelet framework and adopted in [21, 6] in the finite-element framework, allows us to prove the optimality of the algorithm in the algebraic case, provided Dörfler marking is not too aggressive.
The two following lemmas will be useful in the subsequent analysis.

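Lemma 7.1 (localized a posteriori upper bound) Let $\Lambda \subset \Lambda_* \subset \mathbb{Z}^d$ be nonempty subsets of indices. Let $u_\Lambda \in V_\Lambda$ and $u_{\Lambda_*} \in V_{\Lambda_*}$ be the Galerkin approximations of Problem (2.4). Then
\[ |||u_{\Lambda_*} - u_\Lambda|||^2 \le \frac{1}{\alpha_*} \sum_{k \in \Lambda_*} |\hat r_k(u_\Lambda)|^2 = \frac{1}{\alpha_*}\, \eta^2(u_\Lambda, \Lambda_*) . \]

Proof. One has
\[ |||u_{\Lambda_*} - u_\Lambda|||^2 = a(u_{\Lambda_*} - u_\Lambda, u_{\Lambda_*} - u_\Lambda) = (f, u_{\Lambda_*} - u_\Lambda) - a(u_\Lambda, u_{\Lambda_*} - u_\Lambda) = \sum_{k \in \Lambda_*} \hat r_k(u_\Lambda)\, (\hat u_{\Lambda_*} - \hat u_\Lambda)_k , \]
because $\Lambda_*$ is the support of $u_{\Lambda_*} - u_\Lambda$. The asserted result follows immediately by the Cauchy-Schwarz inequality, upon recalling that $\hat r_k(u_\Lambda) = 0$ for all $k \in \Lambda$. □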

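Lemma 7.2 (Dörfler property) Let $\Lambda \subset \Lambda_* \subset \mathbb{Z}^d$ be nonempty subsets of indices. Let $u_\Lambda \in V_\Lambda$ and $u_{\Lambda_*} \in V_{\Lambda_*}$ be the Galerkin approximations of Problem (2.4). Let the marking parameter $\theta$ satisfy $\theta \in (0, \theta_*)$, where $\theta_* = \sqrt{\alpha_*/\alpha^*}$, and set $\mu_\theta = 1 - \frac{\alpha^*}{\alpha_*}\theta^2 > 0$. If
\[ |||u - u_{\Lambda_*}|||^2 \le \mu\, |||u - u_\Lambda|||^2 , \]
for some $\mu \in (0, \mu_\theta]$, then $\Lambda_*$ fulfils Dörfler's condition, i.e.,
\[ \eta(u_\Lambda, \Lambda_*) \ge \theta\, \eta(u_\Lambda) . \]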

Proof. Since $u - u_{\Lambda_*} \perp u_{\Lambda_*} - u_\Lambda$ in the energy inner product, Pythagoras' identity and the assumption yield
\[ |||u - u_\Lambda|||^2 = |||u - u_{\Lambda_*}|||^2 + |||u_{\Lambda_*} - u_\Lambda|||^2 \le \mu\, |||u - u_\Lambda|||^2 + |||u_{\Lambda_*} - u_\Lambda|||^2 . \]
Invoking the lower bound in (3.2) gives
\[ |||u_{\Lambda_*} - u_\Lambda|||^2 \ge (1 - \mu)\, |||u - u_\Lambda|||^2 \ge (1 - \mu)\, \frac{1}{\alpha^*}\, \eta^2(u_\Lambda) , \]
whence applying Lemma 7.1 implies
\[ \eta^2(u_\Lambda, \Lambda_*) \ge (1 - \mu)\, \frac{\alpha_*}{\alpha^*}\, \eta^2(u_\Lambda) \ge (1 - \mu_\theta)\, \frac{\alpha_*}{\alpha^*}\, \eta^2(u_\Lambda) = \theta^2 \eta^2(u_\Lambda) . \]
This concludes the proof. □

We are ready to estimate the growth of degrees of freedom generated by the algorithm ADFOUR of Sect. 3.1. For the moment, we place ourselves in the abstract framework of Sect. 4.1, only the final result being specific to the algebraic case.
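Proposition 7.1 (cardinality of $\partial\Lambda_n$) Let $\theta$ satisfy the condition stated in Lemma 7.2 and let $\mu \in (0, \mu_\theta]$ be fixed. Let $\{\Lambda_n, u_n\}_{n \ge 0}$ be the sequence generated by the adaptive algorithm ADFOUR, and set $\varepsilon_n^2 = \mu\, |||u - u_n|||^2$. If the solution $u$ belongs to the sparsity class $\mathcal{A}_\phi$, then
\[ |\partial\Lambda_n| = |\Lambda_{n+1}| - |\Lambda_n| \le \kappa\, \phi^{-1}\Big( \frac{\varepsilon_n}{\|u\|_{\mathcal{A}_\phi}} \Big) , \qquad \forall\, n \ge 0 , \tag{7.1} \]
where $\kappa > 1$ is the constant in (4.4).

Proof. Let $\varepsilon = \varepsilon_n$ and make use of (4.4) for $u \in \mathcal{A}_\phi$: there exist $\Lambda_\varepsilon$ and $w_\varepsilon \in V_{\Lambda_\varepsilon}$ such that
\[ |||u - w_\varepsilon|||^2 \le \varepsilon^2 \qquad \text{and} \qquad |\Lambda_\varepsilon| \le \kappa\, \phi^{-1}\Big( \frac{\varepsilon}{\|u\|_{\mathcal{A}_\phi}} \Big) . \]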

Let $\Lambda_* = \Lambda_n \cup \Lambda_\varepsilon$ be the overlay of the two index sets, and let $u_* \in V_{\Lambda_*}$ be the Galerkin approximation of Problem (2.4). Then, since $V_{\Lambda_\varepsilon} \subset V_{\Lambda_*}$, we have
\[ |||u - u_*|||^2 \le |||u - w_\varepsilon|||^2 \le \mu\, |||u - u_n|||^2 . \]
Thus, we are entitled to apply Lemma 7.2 to $\Lambda_n$ and $\Lambda_*$, yielding
\[ \eta(u_n, \Lambda_*) \ge \theta\, \eta(u_n) . \]
By the minimality property of the cardinality of $\Lambda_{n+1}$ among all sets satisfying Dörfler property for $u_n$ (Assumption 3.1), we deduce that $|\Lambda_{n+1}| \le |\Lambda_*| \le |\Lambda_n| + |\Lambda_\varepsilon|$, i.e.,
\[ |\Lambda_{n+1}| - |\Lambda_n| \le |\Lambda_\varepsilon| , \tag{7.2} \]

whence the result. □ 
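
The minimality property used in the proof can be made concrete: a set of minimal cardinality fulfilling Dörfler's condition $\eta(u_n, \partial\Lambda) \ge \theta\, \eta(u_n)$ is obtained by collecting the largest residual coefficients until their squared sum reaches $\theta^2 \eta^2(u_n)$. The following is a minimal sketch under that reading; the function name `dorfler` and the dictionary representation of the residual are assumptions, not the paper's implementation.

```python
def dorfler(residual, theta):
    """Smallest index set whose residual energy is >= theta^2 * total.

    residual: dict mapping a Fourier index k to the coefficient r_k.
    Returns the marked indices, ordered by decreasing |r_k|.
    """
    total = sum(abs(r) ** 2 for r in residual.values())
    marked, acc = [], 0.0
    for k in sorted(residual, key=lambda k: abs(residual[k]), reverse=True):
        if acc >= theta ** 2 * total:
            break
        marked.append(k)
        acc += abs(residual[k]) ** 2
    return marked

r = {1: 3.0, 2: 4.0, 3: 0.0, -1: 12.0}
print(dorfler(r, 0.9))    # a single dominant mode suffices
print(dorfler(r, 0.99))   # an aggressive theta marks more modes
```

Choosing $\theta$ closer to 1 enlarges the marked set, which is the trade-off between moderate and aggressive marking discussed in Sects. 7.1 and 7.2.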



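Corollary 7.1 (cardinality of $\Lambda_n$: general case) Let the assumptions of Proposition 7.1 be valid and $\rho = \sqrt{1 - \frac{\alpha_*}{\alpha^*}\theta^2}$ be given by (3.7). Then
\[ |\Lambda_n| \le \kappa \sum_{k=0}^{n-1} \phi^{-1}\Big( \rho^{\,k-n}\, \frac{\varepsilon_n}{\|u\|_{\mathcal{A}_\phi}} \Big) , \qquad \forall\, n \ge 0 . \tag{7.3} \]

Proof. Recalling that $|\Lambda_0| = 0$, the previous proposition yields
\[ |\Lambda_n| = \sum_{k=0}^{n-1} |\partial\Lambda_k| \le \kappa \sum_{k=0}^{n-1} \phi^{-1}\Big( \frac{\varepsilon_k}{\|u\|_{\mathcal{A}_\phi}} \Big) . \]
On the other hand, by Theorem 3.1 one has
\[ \varepsilon_n = \sqrt{\mu}\, |||u - u_n||| \le \sqrt{\mu}\, \rho^{\,n-k}\, |||u - u_k||| = \rho^{\,n-k}\, \varepsilon_k , \qquad \forall\, 0 \le k \le n-1 , \tag{7.4} \]
and we conclude recalling the monotonicity of $\phi$. □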



At this point, we assume to be in the algebraic case, i.e. $u \in \mathcal{A}^s_B$ for some $s > 0$. Then, (7.3) reads
\[ |\Lambda_n| \le \kappa\, \mu^{-d/2s}\, |||u - u_n|||^{-d/s}\, \|u\|^{d/s}_{\mathcal{A}^s_B} \sum_{k=0}^{n-1} \rho^{(n-k)\, d/s} , \qquad \forall\, n \ge 0 . \]
Summing up the geometric series and using (2.5), we arrive at the following result.
Theorem 7.1 (cardinality of $\Lambda_n$: algebraic case) Under the assumptions of Proposition 7.1 the growth of the active degrees of freedom produced by ADFOUR in the algebraic case is estimated as follows:
\[ |\Lambda_n| \le C_*\, \|u - u_n\|^{-d/s}\, \|u\|^{d/s}_{\mathcal{A}^s_B} , \qquad \forall\, n \ge 0 , \]
where the constant $C_*$ depends only on $\alpha_*$, $\mu$ and $\rho$.

This result is "optimal" in that the number of active degrees of freedom is governed, up to a multiplicative constant, by the same law (4.4)-(4.5) as for the best approximation of $u$. The optimality of this result is related to the "sufficiently fast" growth of the active degrees of freedom: the increment of degrees of freedom at each iteration may be comparable to the total number of previously activated degrees of freedom (geometric growth).

7.2 A-ADFOUR: Aggressive ADFOUR

We now examine Algorithm A-ADFOUR, defined in Sect. 3.3, which allows the choice of the parameter $\theta$ as close to 1 as desired. Such a feature is in the spirit of high regularity, or equivalently a large value of $s$ for $u \in \mathcal{A}^s_B$. This is a novel approach which combines the contraction property in Theorem 3.3 and the key property of uniform boundedness of the residuals stated in Proposition 5.2.

Theorem 7.2 (cardinality of $\Lambda_n$ for A-ADFOUR) Let the assumptions of Property 2.2 and Theorem 3.3 be fulfilled, and let $u \in \mathcal{A}^s_B$ for some $s > 0$. Then, the growth of the active degrees of freedom produced by A-ADFOUR is estimated as follows:
\[ |\Lambda_n| \le C_*\, J^d\, \|u - u_n\|^{-d/s}\, \|u\|^{d/s}_{\mathcal{A}^s_B} , \qquad \forall\, n \ge 0 . \]
Here, $J$ is the ($\theta$-dependent) input parameter of ENRICH, whereas the constant $C_*$ is independent of $\theta$.

Proof. At each iteration n, the set dA n selected by DORFLER is minimal, hence by ([3? 
P~3j) and (PD2), one has 



\[ |\partial\Lambda_n| \le \big( \sqrt{1 - \theta^2}\, \|r_n\| \big)^{-d/s}\, \|r_n\|^{d/s}_{\ell^s_B} + 1 . \]
Using (2.9) and Proposition 5.2 this bound becomes
\[ |\partial\Lambda_n| \lesssim \big( \sqrt{1 - \theta^2}\, \|u - u_n\| \big)^{-d/s}\, \|u\|^{d/s}_{\ell^s_B} . \]
On the other hand, estimate (3.18) for the procedure ENRICH yields
\[ |\partial\Lambda_n| \lesssim J^d\, \big( \sqrt{1 - \theta^2}\, \|u - u_n\| \big)^{-d/s}\, \|u\|^{d/s}_{\ell^s_B} . \]
Now, as in the proof of Corollary 7.1,
\[ |\Lambda_n| \lesssim J^d\, (1 - \theta^2)^{-d/2s} \Big( \sum_{k=0}^{n-1} \|u - u_k\|^{-d/s} \Big) \|u\|^{d/s}_{\ell^s_B} . \tag{7.5} \]
The contraction property of Theorem 3.3 yields for $0 \le k \le n-1$
\[ \|u - u_n\| \le \rho^{\,n-k}\, \|u - u_k\| , \]
with $\rho = C \sqrt{1 - \theta^2} < 1$ as given there; thus,
\[ \sum_{k=0}^{n-1} \|u - u_k\|^{-d/s} \le \|u - u_n\|^{-d/s} \sum_{k=0}^{n-1} \rho^{(n-k)\, d/s} \lesssim \rho^{\,d/s}\, \|u - u_n\|^{-d/s} \lesssim (1 - \theta^2)^{d/2s}\, \|u - u_n\|^{-d/s} . \]
Substituting into (7.5), the powers of $1 - \theta^2$ cancel out, and the asserted estimate follows. □

8 Optimality properties of adaptive algorithms: exponential case

From now on, let us assume that $u \in \mathcal{A}^{\eta,t}_G$ for some $\eta > 0$ and $t \in (0, d]$. Let us first observe that none of the arguments that led to the complexity estimates of the previous section can be extended to the present situation.

For ADFOUR with moderate Dörfler marking, Corollary 7.1 in which $\phi^{-1}$ is replaced by its logarithmic expression yields a bound for $|\Lambda_n|$ which is at least $n$ times larger than the optimal bound
\[ |\Lambda_n^{\rm best}| \simeq \frac{\omega_d}{\eta^{d/t}} \Big( \log \frac{\|u\|_{\mathcal{A}^{\eta,t}_G}}{\varepsilon_n} \Big)^{d/t} \]
for the given accuracy $\varepsilon_n$ (see the proof of Proposition 8.1 for more details, in a similar situation). Manifestly, the first cause of non-optimality is the crude bound (7.2), which in this case is no longer absorbed by the summation of a geometric series as in the algebraic case.

On the other hand, for A-ADFOUR a sharp estimate of the increment $|\partial\Lambda_n|$ is indeed used in the proof of Theorem 7.2, but this involves the sparsity class of the residual, which in the exponential case may be different from that of the solution, as discussed in Sect. 5.2.

Incorporating a coarsening step in the algorithms allows us to avoid, at least in part, these drawbacks. For these reasons, hereafter we investigate the optimality properties of the two algorithms with coarsening presented in Sect. 3.

8.1 C-ADFOUR: ADFOUR with coarsening

Let us now discuss the complexity of Algorithm C-ADFOUR, defined in Sect. 3.4. The following optimal result holds.

Theorem 8.1 (cardinality of $\Lambda_n$ for C-ADFOUR) Assume that the solution $u$ belongs to $\mathcal{A}^{\eta,t}_G$, for some $\eta > 0$ and $t \in (0, d]$. Then, there exists a constant $C > 1$ such that the cardinality of the set $\Lambda_n$ of the active degrees of freedom produced by C-ADFOUR satisfies the bound
\[ |\Lambda_n| \le \kappa\, \frac{\omega_d}{\eta^{d/t}} \Big( \log \frac{\|u\|_{\mathcal{A}^{\eta,t}_G}}{\|u - u_n\|} + \log C \Big)^{d/t} , \qquad \forall\, n \ge 0 . \tag{8.1} \]
Proof. Since each Galerkin approximation $u_{n+1}$ comes just after a call $\Lambda_{n+1} := \mathrm{COARSE}(u_{n,k+1}, \varepsilon_n)$ with threshold $\varepsilon_n = \alpha_*^{-1/2}\, \|r_{n,k+1}\| \ge \|u - u_{n,k+1}\|$, Theorem 6.1 yields
\[ |\Lambda_{n+1}| \le \frac{\omega_d}{\eta^{d/t}} \Big( \log \frac{\|u\|_{\mathcal{A}^{\eta,t}_G}}{\varepsilon_n} \Big)^{d/t} + 1 . \]
On the other hand, (2.5) and Property 3.1 yield
\[ \|u - u_{n+1}\| \le \alpha_*^{-1/2}\, |||u - u_{n+1}||| \le 3\, (\alpha^*/\alpha_*)^{1/2}\, \varepsilon_n . \tag{8.2} \]
Since n > — 1, this gives the result, up to a shift in the index. □ 



Next, we investigate the optimality of each inner loop. We already know from Theorem | 
that the number $K_n$ of inner iterations is bounded independently of $n$. So, we just estimate the growth of degrees of freedom when going from $k$ to $k + 1$. We only consider the case of a moderate Dörfler marking, i.e., we subject $\theta$ to the condition stated in Lemma 7.2 (since the case of $\theta$ close to 1 will be covered in the next subsection). The following result holds.

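Proposition 8.1 (cardinality of $\Lambda_{n,k}$ for C-ADFOUR) Assume that $u \in \mathcal{A}^{\eta,t}_G$ for some $\eta > 0$ and $t \in (0, d]$, and that the marking parameter satisfies $\theta \in (0, \theta_*)$, where $\theta_* = \sqrt{\alpha_*/\alpha^*}$. Then, there exist constants $C > 1$ and $\bar\eta \in (0, \eta]$ such that, for all $n \ge 0$ and all $k = 1, \dots, K_n$, one has
\[ |\Lambda_{n,k}| \le \kappa\, \frac{\omega_d}{\bar\eta^{\,d/t}} \Big( \log \frac{\|u\|_{\mathcal{A}^{\eta,t}_G}}{\|u - u_{n,k}\|} + \log C \Big)^{d/t} . \]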

Proof. Each inner loop of C-ADFOUR can be viewed as a truncated version of ADFOUR; hence, the analysis of this algorithm given in Sect. 7.1 can be adapted to the exponential case. In particular, for each increment $\partial\Lambda_{n,j}$ of degrees of freedom, Proposition 7.1 gives

d/t 



\0h n ,j < ( log ) + J. . V ()£./£ /v„. 



\U\\ .-q.t 
£< 

if' 1 \ ' £ n,j 

Since, £ n ,K n < P Kn ~ j ^n,j by (|7.4p . it follows that 



(lltill t \ ' 

log Ji^si + (i ^ n _ i}! logp! ] +1 . 



Thus, recalling that t < d by assumption, we have 

fe-i 

\K,k\ t/d < \A n \^ d + Y,\dAnj\ t/d 



3=0 



t/d 



Ll? / 1 1 1 1 A > ' " 

< | An |^ + K ^_ Uiog ^L + o^ji^ | 

»7 V £ n,K n J 

Combining (3.23), (8.1), and (8.2) with $k \le K_n \lesssim 1$, we conclude the assertion with $\bar\eta \le \eta/(1 + K_n)$. □

We remark that the previous result provides a complexity bound, relative to the sparsity class $\mathcal{A}^{\eta,t}_G$ of the solution, which is optimal with respect to the index $t$, but suboptimal with respect to the index $\bar\eta < \eta$.






8.2 PC-ADFOUR: Predictor/Corrector ADFOUR 

At last, we discuss the optimality of Algorithm PC-ADFOUR, presented in the second part 
of Sect. El 



Theorem 8.2 (cardinality of PC-ADFOUR) Suppose that $u \in \mathcal{A}^{\eta,t}_G$, for some $\eta > 0$ and $t \in (0, d]$. Then, there exists a constant $C > 1$ such that the cardinality of the set $\Lambda_n$ of the active degrees of freedom produced by PC-ADFOUR satisfies the bound
\[ |\Lambda_n| \le \kappa\, \frac{\omega_d}{\eta^{d/t}} \Big( \log \frac{\|u\|_{\mathcal{A}^{\eta,t}_G}}{\|u - u_n\|} + \log C \Big)^{d/t} , \qquad \forall\, n \ge 0 . \]
If, in addition, the assumptions of Proposition 5.4 are satisfied, then the cardinality of the intermediate sets $\Lambda_{n+1}$ activated in the predictor step can be estimated as

UJ 2 ( IMI A*'* \ d / t 

|An+i| < |An| + ^ d ^m log || || +|log y/l-0 2 \+ logC , Vn>0, 

Tf ' \ 1 1 It U n 1 1 / 

where $J$ is the input parameter of ENRICH, and $\bar\eta \le \eta$, $\bar t \le t$ are the parameters which occur in the thesis of Proposition 5.4.

Proof. The proof of the first bound is the same as that of Theorem 8.1. Concerning the second bound, we invoke Proposition 5.4 to write $r_n \in \ell^{\bar\eta,\bar t}_G$ and recall that $\|r_n - P_{\partial\Lambda_n} r_n\| \le (1 - \theta^2)^{1/2}\, \|r_n\|$ for each iteration $n$. This, combined with the minimality of the set $\partial\Lambda_n$ selected by DÖRFLER, yields

d A n < 



fjd/t 

Estimate (3.18) for ENRICH yields




\dA n \ <Kj a -±j log 



d. d I G 



d/t 



fjd/t I b 



521 



r. 



Using (2.8) and Proposition 5.4 this time to replace $r_n$ by $u$ and $u - u_n$, we obtain the desired result. □

We observe that in the case $\bar\eta < \eta$ and $\bar t < t$, the cardinalities $|\Lambda_{n+1}|$ and $|\Lambda_n|$ are not bounded by comparable quantities. This looks like a non-optimal result, yet it appears to be intimately related to the fact that in general the residuals belong to a worse sparsity class than the solution.